Introduction

Although  mammography  significantly  reduces  its  toll,  breast  cancer  remains  a  leading 
cause  of  cancer  mortality  in  the  U.S.  Many  breast  cancers  are  advanced  at  the  time  of 
diagnosis,  even  among  women  participating  in  screening.  The  discovery  of  molecular 
markers  associated  with  breast  cancer  potentially  increases  our  ability  to  diagnose  early 
stage  tumors.  The  translational  goal  of  this  Center  of  Excellence  (CoE)  is  a  panel  of 
serum  markers  with  decision  rules  for  its  use  to  improve  the  performance  of  breast 
cancer screening that includes mammography. The primary aims of this study are: 1) to
validate  and  refine  the  ability  of  candidate  biomarkers  measurable  in  blood  products  to 
predict  disease  status;  2)  to  evaluate  panels  of  serum  markers  for  use  as  an  adjunct  to 
mammography,  to  detect  all  breast  cancer  at  a  highly  curable  stage;  and  3)  to  identify 
the  molecular  signatures  of  benign,  pre-invasive  and  invasive  breast  tissue  and  explore 
their  associations  with  serum  markers  in  the  panel.  Several  years  ago  it  became 
apparent  that  there  were  not  enough  candidate  markers  ready  for  validation.  Since  that 
time  we  have  made  an  effort  to  find  and  prepare  new  markers  for  evaluation  in  the  CoE. 
This  report  details  research  accomplishments  during  the  active  project  period  from 
February 2004 through September 2009. Although CoE funding began on September 23, 2002,
the  DOD  did  not  grant  us  human  subjects  approval  until  February  (Mammography 
Tumor  Registry)  and  May  (Breast  Cancer  Early  Discovery  Study)  of  2004.  In  light  of  this 
delay  we  were  granted  a  2  year  extension  to  complete  our  work  and  the  project  officially 
ended  on  September  22,  2009. 


Body 

During  the  course  of  this  study,  CoE  researchers  have  focused  their  efforts  and  made 
progress  in  three  areas: 

Biomarker Discovery: identification of promising biomarkers and assay development.

Biomarker Evaluation: evaluation of individual assays and assay panels for their ability to
detect breast cancer.

Resource Development: tissue and blood repositories for future research
involving biomarker discovery and validation.

During  the  last  half  of  this  project  investigators  have  created  and  utilized  a  Breast 
Discovery  Set  (BDS)  used  by  CoE  investigators  and  collaborators  to  evaluate  new 
markers  and  conduct  the  molecular  profiling  work  described  in  project  Aim  3,  and  the 
Panel  Development  and  Validation  Sets  (PDS,  PVS),  used  to  evaluate  candidates  in  a 
biomarker  panel  with  the  aim  of  improving  the  sensitivity  of  breast  cancer  screening 
including  mammography  (see  Tasks  10-13  for  more  detail).  Both  the  BDS  and  the  PVS 
may  be  made  available  to  outside  investigators  who  have  promising  biomarkers  or 
discovery  platforms  through  an  RFA  mechanism. 

Subject  recruitment  and  specimen  collection  began  in  Seattle  at  Swedish  Medical  Center 
(SMC)  in  May,  2004  and  continued  through  September  18,  2009.  Mammography  data 
(assessment  codes,  follow  up  recommendations,  and  breast  density)  in  coordination  with 
family history collected on our baseline questionnaire and the GAIL model are used to
determine risk of breast cancer (high, elevated or average risk) for women in the
mammography cohort. The surgical cohort consists of women scheduled to undergo
breast  surgery  for  malignant  or  nonmalignant  conditions.  Surgical  recruitment  at  Cedars- 
Sinai  Medical  Center  in  Los  Angeles  was  led  by  Drs.  Beth  and  Scott  Karlan  and  took 
place  from  July,  2005  through  April,  2008.  The  clinical  protocols  for  each  site  have  been 
standardized  as  much  as  possible  with  shared  data  collection  instruments  and  a  web- 
enabled  data  entry  system  (Seattle  Informatics  Management  System  or  SIM).  A 
summary  of  recruitment  to  date  is  provided  below  in  Table  1. 

Table 1. Summary of cumulative study accrual: May 2004-Sep 2009

Study Population                             Cumulative Study     Total Actual
                                             Recruitment Goals    Accrual*
Mammography Cohort - High/Elevated Risk      600                  570
Mammography Cohort - Average Risk            200                  53
Mammography Cohort - Biopsy                  400                  13
Subtotal: Mammography Cohort                 1200                 636
SMC Surgical Cohort: Blood and Tissue        175                  90
SMC Surgical Cohort: Blood Only              300                  296
Subtotal: Surgical Cohort                    475                  337
Cedars Surgical Cohort: Blood and Tissue     200                  118
Total                                        1475                 1091

* Includes only women who have donated one or more blood specimens and who
have completed a baseline questionnaire.

All CoE participants are asked to donate annual serum samples. Fresh frozen tissue is
collected from surgical participants at the time of their procedure if the pathologist deems
it clinically appropriate. Specimen data are linked to extensive epidemiological and
clinical data including demographic information, information from GAIL model variables,
pedigree, breast density, BIRADS assessment code and follow-up information from
screening mammograms, ER/PR and Her2 status, staging and grade, and clinical
follow-up for treatment and recurrence. Table 2 summarizes the CoE specimens
stored in our repository. We have also collected a number of matched pre- and
post-diagnosis serum specimens from healthy controls who went on to develop breast
cancer during the course of the study (Tables 3a-b).

Table 2. CoE Specimen Summary: Number of Collections* (Seattle Only)

Column key:
  Participants = participants with available specimens
  Blood products:
    Serum  = serum (~thirteen 1 ml specimens/collection)
    EDTA   = EDTA plasma (~four 1 ml specimens/collection)
    ACD    = ACD plasma (~one 4.5 ml specimen/collection)
    Buffy  = ACD buffy coat cells (~2 specimens/collection)
  Tissue products:
    Snap   = snap frozen tissue (1-5 specimens/tissue site)
    OCT    = OCT-embedded frozen tissue (3 tissue blocks/tissue site)
    FFPE   = formalin-fixed, paraffin-embedded tissue (MRI Protocol only)

Participant Pathology                    Blood Products               Tissue Products
(Most Severe Dx for
that patient)           Participants     Serum   EDTA   ACD    Buffy  Snap   OCT    FFPE
Atypia                  5                11      11     7      7      0      0      0
Benign                  18               23      23     17     18     0      0      0
Carcinoma Unknown       1                2       2      2      2      0      0      0
In Situ                 60               165     160    91     91     1      1      0
Invasive/Infiltrating   290              732     717    398    400    100    97     8
Normal                  585              1550    1535   676    677    3      0      0
Total                   959              2483    2448   1191   1195   104    98     8

* “Collection” refers to all specimens collected from a single participant on a given day. Some
participants have donated specimens 2 or 3 separate times.


Table 3a. Preclinical blood samples available: summary of breast cancer cases
diagnosed after enrollment in women who have donated specimens.

                                  Stage
Study                             0    I    IIA  IIB  III  IIIA  IV   Unavail.  Total
Mammography Cohort - Average,     6    3    +    +    +                         11
  Elevated and High Risk
Mammography Cohort - Biopsy       2    1    +    +    +                         4
CoE Subtotal                      8    4    1    1    1    0     0    0         15
Other TOR Studies                 27   24   5    4    0    2     1    3
Total                             35   28   6    5    2    2     1    3         19

+ Two single cases in the first row and one single case in the second row fall in
stages IIA, IIB and III; their exact columns are not legible in this copy.

Table 3b. Time period in which preclinical (pre-diagnosis) blood samples were
collected.

                              Number of cases per time period
Months from blood draw to     CoE Mammography     Participants in other
breast cancer diagnosis       Cohort              TOR studies
0-6                           18                  18
7-12                          5                   12
13-18                         7                   13
19-24                         5                   10
25-30                         3                   5
31-36                         4                   4
37-42                         3                   4
43-48                         1                   3

The  TOR  Laboratory  has  purchased  or  developed  assays  for  65  promising  candidate 
markers  for  evaluation  in  the  BDS  and  an  additional  14  potential  biomarkers  have  been 
identified  and  tested  in  the  BDS  by  our  collaborators.  The  BDS  is  provided  to 
investigators  in  a  blinded  fashion;  once  data  are  sent  back  to  the  TOR  laboratory, 
investigators  may  be  unblinded  to  case/control  status,  histologic  subtype,  stage,  and 
potentially  other  variables  on  request.  The  BDS  includes  66  specimens  from  66  patients. 
Characteristics  of  the  BDS  are  described  below  in  Table  4.  Of  the  79  biomarkers 
analyzed  in  the  BDS,  9  achieved  greater  than  30%  sensitivity  to  breast  cancer  cases  at 
the 90% specificity level: SPARC, GDF15 (MIC1, X065), FN1, SPP1 (Osteopontin),
IL17,  COL1A1,  IGFBP2,  CTGF,  and  MMP7  (Figure  1).  Appendix  A  describes  these 
assays  in  greater  detail.  COL1A1,  IL17  and  SPARC  appeared  lower  among  cases  vs. 
controls, while the remaining 6 markers were elevated in the cases. To evaluate
the complementarity of the most sensitive markers, a logistic regression model was fit
using  the  6  most  sensitive  markers.  Coefficients  from  the  fitted  model  were  used  to  form 
a  panel  using  all  6  markers.  The  resulting  marker  panel  achieved  61%  sensitivity  at  90% 
specificity  (Figure  2). 
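
As a rough illustration of the metric reported above, the sketch below computes sensitivity at a fixed specificity by placing the cutoff at the corresponding quantile of the control values, and forms a linear panel score from per-marker coefficients. It is a minimal sketch and not the analysis code used in this study: the function names, the quantile convention and the example values are assumptions made for illustration, and in practice the coefficients would come from a fitted logistic regression model.

```python
import math


def sensitivity_at_specificity(control_values, case_values,
                               specificity=0.90, higher_in_cases=True):
    """Fraction of cases called positive at a cutoff that keeps at least
    `specificity` of the controls negative (hypothetical helper)."""
    if not higher_in_cases:
        # Markers that are lower in cases are negated so that a larger
        # value always means "more suspicious".
        control_values = [-v for v in control_values]
        case_values = [-v for v in case_values]
    ranked = sorted(control_values)
    # Number of controls that must fall at or below the cutoff.
    n_negative = max(1, math.ceil(specificity * len(ranked) - 1e-9))
    cutoff = ranked[n_negative - 1]
    positives = sum(1 for v in case_values if v > cutoff)
    return positives / len(case_values)


def panel_score(marker_values, coefficients, intercept=0.0):
    """Linear predictor of a logistic model: intercept + sum(b_i * x_i)."""
    return intercept + sum(b * x for b, x in zip(coefficients, marker_values))


# Made-up example values, not study data.
controls = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
cases = [5, 9.5, 11, 12]
print(sensitivity_at_specificity(controls, cases))  # 0.75
```

A panel built this way can be evaluated with the same function by passing the panel scores of controls and cases in place of single-marker values.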

Unpublished Data

Table 4. BDS: Summary of participant characteristics

                                Healthy         In-Situ         Invasive
                                Controls        Cases           Cases
Number of Collections           33              11              22
Number of Patients              33              11              22
Age at Collection (mean, sd)    53.8 (7.9)      55.6 (4.4)      53.2 (10.8)
Collected at Surgery            0 (0%)          0 (0%)          0 (0%)
Current OC Use                  2 (6.1%)        0 (0%)          0 (0%)
Current HRT Use                 2 (6.1%)        0 (0%)          2 (9.1%)
Stage 1                         N/A             1 (9.1%)        5 (22.7%)
Stage 2                         N/A             0 (0%)          12 (54.5%)
Stage 3                         N/A             0 (0%)          4 (18.2%)
Tested for ER                   N/A             10 (90.9%)      22 (100%)
Tested for PR                   N/A             10 (90.9%)      22 (100%)
Tested for HER2                 N/A             2 (18.2%)       19 (86.4%)
Triple Neg or HER2 Pos          N/A             5 (45.5%)       9 (40.9%)

Figure 1. ROC curves for serum markers tested in the BDS with greater than 30%
sensitivity at 90% specificity.

Figure 2. Combined results (combination markers) for the 6 best performing
markers in the BDS.

This  year  investigators  put  together  a  Breast  Panel  Validation  Set  of  1067  specimens 
collected  in  Seattle  from  CoE  participants.  This  large  set  was  divided  into  two  smaller 
sets:  the  Panel  Development  Set  (PDS)  consisting  of  527  specimens  from  317 
participants,  and  the  Panel  Validation  Set  (PVS)  containing  537  specimens  from  323 
participants.  Tables  5a-b  below  summarize  patient  characteristics  across  both  sets. 

Table 5a. Breast Panel Validation Set Patient Characteristics, part 1
(D = Development Set, V = Validation Set)

                        Healthy Controls            In-Situ Cases               Invasive Cases
                        D             V             D             V             D             V
Number of Collections   364           378           24            20            139           139
Number of Patients      204           205           20            17            93            101
Age at Collection       55.7 (8.9)    55.5 (9.6)    57 (9.5)      56.3 (13)     57.9 (12.3)   54.3 (10.6)
Collected at Surgery    0 (0%)        0 (0%)        12 (50%)      11 (55%)      71 (51.1%)    75 (54%)
Current OC Use          33 (9.1%)     25 (6.6%)     0 (0%)        0 (0%)        2 (1.4%)      4 (2.9%)
Current HRT Use         78 (21.4%)    71 (18.8%)    0 (0%)        0 (0%)        (6.5%)        8 (5.8%)

Table 5b. Breast Panel Validation Set Patient Characteristics, part 2
(D = Development Set, V = Validation Set)

                                    In-Situ Cases               Invasive Cases
                                    D             V             D             V
Stage 1                             1 (4.2%)      0 (0%)        67 (48.2%)    51 (36.7%)
Stage 2                             1 (4.2%)      0 (0%)        43 (30.9%)    55 (39.6%)
Stage 3                             0 (0%)        0 (0%)        18 (12.9%)    13 (9.4%)
Tested for ER                       21 (87.5%)    20 (100%)     131 (94.2%)   120 (86.3%)
Tested for PR                       21 (87.5%)    20 (100%)     131 (94.2%)   120 (86.3%)
Tested for HER2                     1 (4.2%)      0 (0%)        117 (84.2%)   116 (83.5%)
Triple Negative or HER2 Positive    3 (12.5%)     7 (35%)       21 (15.1%)    32 (23%)

A total of seven markers were tested in the entire Breast Panel Validation Set. Each
marker was standardized by centering and scaling the results so the healthy controls
had a mean of 0 and standard deviation of 1. Linear models were fit to each individual
marker in the PDS to evaluate the effects of covariates on serum marker levels. Because
each marker was standardized prior to fitting the model, coefficients from each model
can be interpreted as the expected change in marker level for each unit change of the
covariate, in units of standard deviations in the healthy controls. For example, HE4 is
expected to rise by an amount equivalent to 0.02 standard deviations in the healthy
controls when age is increased by 1 year (Table 6a). Results from these models (Tables
6a-b) show how marker levels vary by characteristics of the woman, in particular the
effects of malignancy controlling for potentially confounding conditions. Of particular
interest is the coefficient for early-stage disease, which is statistically significant only for
SPARC. MIC1 also appears to provide some signal in in situ disease. However, none
of the markers achieved 20% sensitivity at 90% specificity or performed well in ROC
analysis (Table 7).
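
The standardization described above can be sketched as follows. This is an illustrative sketch rather than the study's analysis code: the function name and example values are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
import statistics


def standardize_to_controls(values, control_values):
    """Center and scale `values` so that the healthy controls have
    mean 0 and standard deviation 1 (hypothetical helper)."""
    mean = statistics.mean(control_values)
    sd = statistics.stdev(control_values)  # sample SD: an assumption
    return [(v - mean) / sd for v in values]


# Made-up marker levels, not study data.
healthy = [2.0, 4.0, 6.0]
z_controls = standardize_to_controls(healthy, healthy)   # [-1.0, 0.0, 1.0]
z_cases = standardize_to_controls([4.0, 8.0], healthy)   # [0.0, 2.0]

# Reading a coefficient on the standardized scale: with the HE4 age
# coefficient of 0.02 (Table 6a), a 10-year age difference corresponds
# to an expected shift of about 0.2 control standard deviations.
expected_shift = 0.02 * 10
```

Because every marker is expressed in control standard deviations, coefficients for different markers can be compared directly even when the raw assays use different units.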

Age at the time of collection was also found to influence five of seven markers (HE4,
FN1, MMP7, COL1A1, and MIC1) and current oral contraceptive (OC) use was found to
influence four of seven markers (HE4, FN1, TFF3, and MIC1; Tables 6a-b). Hormone
replacement therapy and blood draw conditions were found to influence SPARC (Table
6b). These factors have the potential to confound biomarker discovery and
validation experiments and should be accounted for in future experiments.

Table 6a. Coefficients and p-Values from Linear Models (by GEE) in Breast Panel
Development Set (n = 529) for HE4, FN1, TFF3 and MMP7

                               HE4                 FN1                 TFF3                MMP7
                               Coeff     p-value   Coeff     p-value   Coeff     p-value   Coeff     p-value
Age at Collection              0.0200    0.0020    0.0104    0.0490    0.0092    0.3072    0.0361    0.0000
Current OC Use                 0.4870    0.0250    -0.3633   0.0138    1.3800    0.0002    -0.0767   0.5543
Current HRT Use                -0.0239   0.8674    0.0380    0.8052    -0.0320   0.6822    -0.0450   0.8028
Collected Day of Surgery       0.1530    0.4009    -0.1545   0.2703    -0.0508   0.4651    0.0575    0.6387
In Situ                        -0.1785   0.2584    0.1299    0.5763    -0.0583   0.6720    -0.2575   0.1464
Early Stage Invasive           0.3596    0.1135    -0.0027   0.9871    0.0837    0.5574    0.1743    0.3405
Late - Early Stage Invasive    0.6440    0.3692    0.1471    0.6019    -0.1216   0.5412    0.1133    0.6737
Triple Negative or HER2        -0.1928   0.4921    0.3181    0.4031    0.3482    0.2933    0.2364    0.3407
  Positive

Table 6b. Coefficients and p-Values from Linear Models (by GEE) in Breast Panel
Development Set (n = 529) for COL1A1, MIC1, and SPARC

                               COL1A1                MIC1                  SPARC
                               Estimate   p-Value    Estimate   p-Value    Estimate   p-Value
Age at Collection              0.0220     0.0000     0.0352     0.0000     -0.0089    0.0932
Current OC Use                 0.0169     0.8987     0.7605     0.0316     0.0926     0.7087
Current HRT Use                0.2565     0.0955     -0.0702    0.5789     0.3360     0.0354
Collected Day of Surgery       -0.0486    0.6574     -0.0868    0.6287     -0.4297    0.0006
In Situ                        -0.3221    0.1989     0.6447     0.0378     0.3746     0.0686
Early Stage Invasive           -0.0935    0.4902     0.1333     0.5146     0.4800     0.0044
Late - Early Stage Invasive    -0.3823    0.0484     0.9238     0.2006     0.2260     0.5062
Triple Neg or HER2 Pos         -0.0412    0.8860     0.1884     0.6657     0.1738     0.5051

Table 7. Summary of ROC Curves in the PDS

            Sensitivity at              Sensitivity at              AUC
            95% Specificity             90% Specificity
            Unadjusted    Adjusted*     Unadjusted    Adjusted*     Unadjusted    Adjusted*
MIC1        0.150         0.059         0.195         0.118         0.538         0.530
FN1         0.044         0.066         0.106         0.160         0.531         0.555
SPARC       0.071         0.049         0.168         0.127         0.560         0.570
COL1A1      0.080         0.066         0.142         0.160         0.608         0.601
HE4         0.062         0.113         0.150         0.160         0.569         0.519
TFF3        0.027         0.075         0.062         0.104         0.460         0.520
MMP7        0.133         0.094         0.177         0.198         0.567         0.545

* Adjusted for Age, Surgical Conditions in First Collection w/o Current OC or HRT Use


Since  the  2006  CoE  workshop  in  Arlington,  Virginia,  resources  have  been  devoted  to 
conducting  discovery  work  in  both  serum  and  breast  tissue,  since  there  are  not  enough 
candidate  markers  currently  ready  or  available  for  evaluation.  This  year  CoE  investigator 
Dr.  Michel  Schummer  completed  a  discovery  project  using  RNA  extracted  from  tissues 
of  CoE  cases  and  controls  to  look  at  expression  in  genes  that  have  been  identified  in  the 
literature or by our collaborators as potentially involved in the development of breast
cancer.  All  tissue  specimens  were  characterized  by  histology  and  underwent  RNA 
extraction.  Dr.  Schummer  then  selected  a  subset  of  94  tissues  from  64  participants 
whose  histology  corresponded  to  the  patient  diagnosis  to  be  used  in  real-time  PCR 
experiments evaluating gene expression (Table 8).


Table 8. Specimen Set for Real-Time PCR

Histology                         Patients    Specimens
Healthy Control (mammaplasty)     20          28
Ipsi- or contralateral Normal     37          38
Invasive Case                     23          25

15 patients donated both cancer and ipsi- or contralateral tissue.

In  total,  Dr.  Schummer  has  examined  134  genes  and  identified  46  as  possible  marker 
candidates  for  validation  in  serum.  Genes  were  ranked  by  their  ability  to  discriminate 
between  invasive  cases  and  healthy  mammaplasty  controls  using  a  threshold  of  3 
standard  deviations  above  the  mean  of  the  controls  if  expression  in  cases  is  higher  than 
in  controls,  or  the  lowest  value  of  the  controls  in  the  case  of  genes  with  a  lower 
expression in cases. A transcript marker was terminated if <20% of invasive cancer
samples were higher or lower than the threshold. The results of this project are
described further under “Key Research Accomplishments.” Markers that
had serum assays readily available have been evaluated in serum by the TOR
Laboratory. So far, 41 protein markers identified by this project have been tested in the
BDS. There are 32 marker genes for which no serum assay is commercially available,
but assay development for 8 of these has been started. The top eight candidate markers
identified by PCR were tested in the BDS using available ELISA assays: SPARC,
GDF15 (MIC1), FN1, SPP1, COL1A1, CTGF, MMP7, and WFDC2 (HE4).
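
The ranking threshold described above could be sketched as below. It is a hedged illustration, not the procedure actually used in this project: the function names and example values are hypothetical, the direction of change is decided here by comparing group means, and the sample standard deviation is used; neither choice is specified in the text.

```python
import statistics


def fraction_beyond_threshold(control_expr, case_expr):
    """Fraction of invasive-case samples beyond the control-based threshold."""
    if statistics.mean(case_expr) >= statistics.mean(control_expr):
        # Higher in cases: threshold is 3 SD above the control mean.
        threshold = (statistics.mean(control_expr)
                     + 3 * statistics.stdev(control_expr))
        hits = sum(1 for v in case_expr if v > threshold)
    else:
        # Lower in cases: threshold is the lowest control value.
        threshold = min(control_expr)
        hits = sum(1 for v in case_expr if v < threshold)
    return hits / len(case_expr)


def keep_candidate(control_expr, case_expr, min_fraction=0.20):
    """A marker is dropped when fewer than 20% of cases pass the threshold."""
    return fraction_beyond_threshold(control_expr, case_expr) >= min_fraction


# Made-up expression values, not study data.
mammaplasty_controls = [1.0, 1.2, 0.8, 1.0]
invasive_cases = [2.0, 3.0, 1.1, 0.9, 2.5]
print(keep_candidate(mammaplasty_controls, invasive_cases))
```

Genes that pass can then be ranked by the fraction returned from `fraction_beyond_threshold`, with larger fractions indicating better discrimination between invasive cases and controls.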

The  CoE  specimen  repository  is  being  utilized  by  collaborators  for  related  discovery  and 
early  detection  work.  For  example,  CoE  collaborator  Dr.  Samir  Hanash  of  the  Fred 
Hutchinson  Cancer  Research  Center  is  using  specimens  from  our  repository  for  his 
project titled “Alliance of Glycobiologists for Detection of Cancer and Cancer Risk” (U01
CA128427). The study involves implementation of a new paradigm in the use of glycan
biomarkers  for  early  detection  of  cancer.  Dr.  Hanash’s  Lab  used  CoE  and  other 
specimens  for  biomarker  discovery  and  contributed  13  candidate  markers  for  evaluation 
in  the  BDS.  Three  of  those  markers  were  among  the  top  nine  performing  markers 
measured  in  the  BDS  and  went  forward  to  be  measured  in  the  PDS  and  the  PVS  to 
validate these initial findings. Unfortunately, two of the markers (CTGF, FN1) did not
perform  well  in  the  larger  serum  set.  The  third  marker  (IGFBP4)  is  still  being  measured 
and  results  are  pending. 

Dr.  Tony  Blau  received  CoE  tissue  specimens  to  evaluate  the  expression  of  the  Epo 
receptor  in  breast  tumors.  His  preliminary  work  helped  him  to  obtain  Avon-NCI 
partnership funding through the SPORE supplemental mechanism. He went on to use
CoE  tissue  samples  for  laser  capture  fractionation  with  the  goal  of  determining  the  types 
of  cells  within  primary  tumors  that  express  erythropoietin  receptor.  In  addition,  he 
evaluated the presence of a single nucleotide polymorphism (SNP) 1120 bp upstream
from the Epo promoter region that has been associated with elevated Epo levels in
vitreous fluid. Serum Epo and EpoR levels were measured for the same participants for
correlative studies. He has published one manuscript to date and a second one has
been submitted. Table 9 below enumerates all collaborators who have received CoE
specimens and the current outcomes of that work.

Table 9. CoE Specimens provided to Investigators for Biomarker Measurement

Investigator Requesting         Specimens Received        Markers Measured       Outcome
Specimens or Information
Victor Levenson, MD, PhD        BDS, Plasma               Methylation markers    In progress
Samir Hanash, MD, PhD           BDS, Plasma and Serum     CTGF, CXCL1, FN1,      3 markers moved on for
  (FHCRC)                                                 IGFBP2, IGFBP4,        measurement in PDS
                                                          IGFBP5                 and PVS
Eleftherios Diamandis           BDS, Serum                Two unknown            Not promising
  (Mt. Sinai, Toronto)                                    candidates
C. Anthony Blau, MD (UW)        Snap frozen and OCT-      Epo, EpoR, Epo SNPs    Avon funding,
                                embedded frozen                                  manuscript published
                                tissues, FFPE tissues
diaDexus, Inc.                  BDS                       X065 (MIC-1)           Promising - moved on
                                                                                 for measurement in
                                                                                 PDS and PVS
Andre Baron, MD, PhD            BDS, Serum                sEGFR                  In progress
  (University of Kentucky)
BioCurex                        BDS, Serum                RECAF                  In progress
Samir Hanash, MD, PhD           PDS, PVS                  TFF3                   Not promising
  (FHCRC)

Our  consumer  advocates  have  played  an  important  role  throughout  our  CoE.  Their 
activities  have  included  review  of  new  participant  materials,  participation  in  scientific 
meetings,  and  working  with  investigators  to  determine  how  best  a  panel  of  markers 
could  be  used  in  a  clinical  setting. 

During  the  last  funding  period  we  have  held  four  scientific  meetings.  Recent  topics 
include  preliminary  results  from  Dr.  Schummer’s  discovery  profiling  work,  and  a 
presentation by Dr. Melanie Palomares from City of Hope in Duarte, CA on the
detection  of  circulating  tumor  cells  using  quantitative  real-time  PCR.  The  2009  CoE 
Annual  Workshop  was  held  in  August  and  the  agenda  is  included  as  Appendix  C.  Over 
the  years  these  meetings  have  included  presentations  by  study  investigators, 
collaborators  and  outside  experts  on  work  relevant  to  the  aims  of  the  CoE  with  additional 
time  allowed  for  discussion.  Presentations  in  2009  have  focused  on  the  results  related  to 
study  aims  1  and  3. 

Investigators continue to refine the draft manuscript of the results from an investigation
of the impact of DCIS detection and treatment on breast cancer mortality and associated
overdiagnosis using a microsimulation model (the manuscript is titled “Quantifying Risks of
Breast Cancer Mortality and Overdiagnosis due to Mammography-diagnosed DCIS”).
This is a continuation of work that was initially developed through a previously funded
DOD grant (DAMD17-94-J-4237). The primary focus during this final year of the CoE
has been on assay development and biomarker evaluation. Investigators plan to finish
the microsimulation model manuscript in 2010.

Below  we  outline  each  task  included  in  our  Statement  of  Work  and  detail  efforts  toward 
completion of each task. In October 2008, we applied for a no-cost extension extending
this  study  for  12  additional  months;  therefore,  this  report  represents  our  entire  progress 
to  date  for  all  7  years  of  funding  (months  1-84). 

TASK 1: Recruit women undergoing mammography to donate serial blood
samples (Mammography Cohort) (completed)

Task 1a: Obtain Consent to Contact and Screening Questionnaire from women
undergoing mammography at participating facilities (complete). This task was
conducted between May, 2004 and September 30, 2008. Recruitment of potential
participants  occurred  at  Swedish  Medical  Center  mammography  clinics  and  community 
health  and  outreach  events  such  as  the  Swedish  SummeRun,  which  raises  money  and 
awareness  for  ovarian  cancer  research.  During  the  study  period  consent  to  contact  was 
obtained  from  1,986  women.  These  women  were  sent  a  short,  one  page  screening 
questionnaire  to  collect  preliminary  risk  information.  66%  of  the  screening  questionnaires 
have  been  returned.  Information  from  these  forms  was  entered  into  the  SIM  database 
and  was  used  to  select  eligible  women  to  approach  for  the  CoE  study. 

Task  1b:  Obtain  mammography  data  from  participating  facilities  (complete).  Collection 
of  mammography  data  for  this  study  has  been  completed.  Over  the  course  of  the  CoE 
we  have  obtained  5  data  downloads  from  Swedish  Medical  Center’s  Mammography 
Reporting  System  (MRS),  an  electronic  database  used  by  Swedish  Medical  Center 
radiology  facilities.  For  participants  who  receive  mammograms  outside  of  Swedish 
Medical  Center  we  use  self-reported  information  from  the  health  status  or  baseline 
questionnaire  to  determine  the  location  and  date  of  the  woman’s  most  recent  screening 
mammogram  and  contact  the  hospital  or  clinic  directly  to  request  a  copy  of  the  report 
and  any  subsequent  diagnostic  reports  (if  applicable.)  The  reports  are  then  abstracted 
by  trained  study  staff  into  the  same  data  entry  screens  in  SIM  that  store  MRS  data. 

We  run  a  linking  algorithm  to  match  study  participants  to  their  mammography  results 
from  MRS.  Using  this  data  we  have  been  able  to  incorporate  mammography  information 
such  as  assessment  code,  density  and  follow-up  recommendations  into  our  risk 
algorithm.  Approximately  80%  of  our  participants  have  electronic  records  in  the 
Mammography  Reporting  System. 

Task  1c:  Using  on-going  sampling  technique,  stratify  population  by  risk  (complete). 

Information collected on our study questionnaires and mammography results are used to
stratify our study population by risk, allowing us to characterize a woman as high,
elevated or average risk. A woman is determined to be at high risk based on family
history, Ashkenazi Jewish descent, a self-reported positive test for a BRCA1
or BRCA2 mutation, or a prior history of breast biopsy. A woman is
determined to be at elevated risk for breast cancer by GAIL Model, breast density,
mammography assessment codes, or mammography follow-up recommendations. The
table below summarizes the number of women enrolled into the mammography cohort
and associated risk levels based on collected information. All women reported in the
table have donated specimens and completed baseline questionnaires.
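
The stratification rule described above can be sketched as a small classifier. This is only an illustration under stated assumptions: the field names are hypothetical, each criterion is represented as an already-evaluated yes/no flag because the text does not give the numeric cutoffs (for example, for the GAIL model or breast density), and the precedence of high over elevated risk is assumed.

```python
from dataclasses import dataclass


@dataclass
class RiskInputs:
    # High-risk criteria (hypothetical field names).
    family_history_flag: bool = False
    ashkenazi_descent: bool = False
    brca_positive_self_report: bool = False
    prior_breast_biopsy: bool = False
    # Elevated-risk criteria (hypothetical field names).
    gail_flag: bool = False
    density_flag: bool = False
    assessment_code_flag: bool = False
    followup_flag: bool = False


def classify_risk(r: RiskInputs) -> str:
    """Return 'high', 'elevated' or 'average' for one participant."""
    if (r.family_history_flag or r.ashkenazi_descent
            or r.brca_positive_self_report or r.prior_breast_biopsy):
        return "high"  # assumed to take precedence over elevated risk
    if (r.gail_flag or r.density_flag
            or r.assessment_code_flag or r.followup_flag):
        return "elevated"
    return "average"


# Made-up participant, not study data.
print(classify_risk(RiskInputs(density_flag=True)))  # elevated
```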


Table 10. Mammography Cohort Breakdown by Risk

Mammography Cohort participants donating specimens
with completed baseline data

Risk Category     Participants     Percentage of Total
High              233              37.4%
Elevated          337              54.1%
Average           53               8.5%
Total             623              100%

Task 1d: Approach selected women for blood donation (complete). Participants were
approached  for  blood  donation  beginning  in  October,  2004  and  ending  in  September, 
2008.  Participants  are  asked  to  donate  blood  on  or  close  to  the  date  of  their  annual 
screening  mammogram.  Of  the  623  mammography  cohort  participants  who  have 
donated  blood  for  the  study,  449  completed  serial  draws. 

Task 1e: Send blood donation appointment letters and epidemiologic risk factor
questionnaires to consenting women (complete). To date, 637 women have completed
one or more study blood draws. All of these women received an epidemiologic risk factor
questionnaire at baseline and a shorter health status questionnaire at each blood draw
appointment  (provides  updated  information  about  medical  history  variables  that  may 
affect  marker  status).  100%  have  completed  and  returned  the  health  status 
questionnaire  and  601  (96%)  completed  and  returned  the  baseline  questionnaire. 
Updated  family  history  information  is  collected  on  an  end-of-study  supplemental 
questionnaire  mailed  out  after  the  participant’s  final  study  blood  draw. 

Task 1f: Receive and data enter questionnaires (complete). Questionnaires are entered
within  48  hours  of  receipt.  Quality  control  data  entry  is  performed  on  all  baseline 
questionnaires  and  approximately  10%  of  the  health  status  questionnaires.  The 
database  manager  periodically  reviews  quality  control  data  entry  on  all  questionnaires  to 
track  our  error  rates  and  ensure  quality  control  entry  is  occurring  on  a  sufficient  number 
of  questionnaires. 

TASK 2: Recruit women undergoing stereotactic biopsy to donate pre-biopsy and
serial follow-up blood samples (Biopsy Cohort)

Task 2a: Finalize approach procedures to be used by Swedish Breast Care Center
(complete). In September 2001, Dr. Urban received funds from an NCI-Avon “Progress
for Patients” award (P50CA83636) that allowed us to develop and test procedures to
recruit  and  enroll  women  who  were  undergoing  stereotactic  biopsy  at  the  Swedish 
Breast  Care  Center  (SBCC),  part  of  SMC.  For  this  “Avon  study”  women  were  asked  to 
provide  a  one-time,  pre-biopsy  blood  donation  and  complete  both  the  baseline  and 
health  status  questionnaires.  143  women  were  enrolled  in  this  study  at  SBCC.  We 
adopted  the  same  procedures  to  recruit  and  enroll  women  scheduled  for  breast  biopsies 
at  SBCC  into  the  CoE  study.  Women  enrolled  into  the  CoE  were  asked  to  give  a  blood 
sample  prior  to  their  biopsy  procedure  in  addition  to  an  annual  sample  at  the  time  of 
subsequent  mammograms. 

Task 2b: Specimen Collection Specialist attends biopsy appointment to obtain informed
consent, collect pre-biopsy blood sample, and provide epidemiologic risk factor
questionnaire (complete). Biopsy patients were approached and enrolled in this study
from January 2007 to August 2008. To date, 13 participants have been enrolled and
have donated a blood sample just prior to their biopsy (1 invasive case, 2 in situ cases
and 10 benign controls).

TASK 3: Recruit women undergoing surgery to donate pre-surgery and follow-up
blood samples, and collect tissue on selected breast cancer cases (Surgical
cohort).

Task 3a: Work with surgeons’ offices to integrate patient approach procedures into the
patient care flow (complete). We have worked closely with participating breast surgeons
and clinic staff to design and implement patient approach procedures for recruitment that
have integrated successfully with normal clinic flow. Our study personnel
are  able  to  maintain  an  open  dialogue  with  participating  physicians  about  study  progress 
and  procedures  by  checking  in  with  them  and  their  staff  on  a  daily  basis.  This  creates  an 
environment  where  physicians  and  study  staff  are  able  to  work  together  to  continuously 
refine  and  improve  our  approach  procedures. 

Task  3b:  Pilot  patient  approach  and  specimen  collection  procedures  (complete). 

Patient  approach  began  in  July,  2004.  Swedish  Medical  Center  breast  surgeons  identify 
patients  that  are  likely  candidates  for  surgical  specimen  collection  and  at  the  pre-surgical 
visit  approach  these  patients  about  study  participation.  If  the  patient  is  interested,  the 
physician  will  obtain  verbal  consent  for  study  staff  to  contact  the  patient  either  in  person 
or  by  phone.  If  a  study  staff  member  is  present  at  the  clinic,  the  physician  invites  the 
woman  to  speak  to  the  study  representative  who  can  help  answer  immediate  questions 
or  concerns.  If  the  patient  chooses,  she  may  be  enrolled  at  this  time  (if  she  meets  the 
eligibility requirements). Otherwise, study staff contacts her by phone to discuss the
study  in  further  detail  and  set  up  an  enrollment  appointment  to  conduct  in-person 
informed  consent  and  collect  a  pre-surgical  blood  sample. 

Task 3c: Routinely approach selected women undergoing surgery for blood and tissue
collection or blood only collection (complete). Surgical patients in Seattle were
approached and enrolled in the CoE from July 2004 to September 2009. Since April 2008,
recruitment  has  been  limited  to  only  those  patients  whose  specimens  were  still  needed 
for  the  discovery  work  described  on  page  10.  To  date,  we  have  enrolled  90  participants 
in  Seattle  from  SMC  who  have  successfully  completed  questionnaire  data  and  donated 
blood  and  tissue,  and  296  who  have  completed  questionnaire  data  and  donated  only 
blood. 

Task 4. Recruit women undergoing biopsy or surgery to donate a one-time only
pre-surgical blood and tissue sample, as feasible, at Cedars Sinai Medical Center.

Task 4a: Finalize approach procedures to be used by Dr. Scott Karlan at Cedars-Sinai
Medical Center (complete). This task has been completed and the Cedars-Sinai Clinical
and  Recruitment  protocol  received  DoD  Human  Subjects  approval  in  July  2005. 

Drs.  Scott  and  Beth  Karlan  have  approached  physicians  who  attend  Breast  Center 
conferences,  to  educate  them  about  available  research  protocols  for  interested  patients. 
Recruitment  flyers  and  brochures  are  posted  around  the  Cedars  Sinai  campus 
(specifically,  the  Saul  and  Joyce  Brandman  Breast  Center  and  the  Cedars-Sinai 
Outpatient  Surgery  Center)  and  made  available  to  raise  patient  awareness.  This  study  is 
also  listed  on  the  Cedars-Sinai  web  site. 

Eligible  women  previously  scheduled  for  a  breast  surgical  procedure  that  involves  the 
removal  of  some  or  all  of  their  breast  tissue  are  approached  about  possible  study 
participation.  Patients  are  not  scheduled  for  surgical  procedures  for  the  purpose  of  this 
study  alone.  The  Principal  Investigator,  co-investigators,  or  treating  physicians  (usually 
a breast surgeon, occasionally a radiologist or a medical oncologist) help identify
potential  subjects.  The  treating  physician  makes  initial  contact  with  potential  subjects 
and  contacts  a  trained  study  staff  member  to  consent  the  patient  into  the  study  if  the 
woman  agrees  to  participate. 

Task  4b:  Routinely  approach  selected  women  for  blood  and  tissue  collection  (complete). 

Drs. Beth and Scott Karlan and their study staff recruited eligible women into the CoE
study at Cedars Sinai Medical Center from October 2005 to April 2008. Their study
enrollment  goal  was  50  surgical  women  per  year  for  the  duration  of  the  study.  The 
population  includes  healthy  women  with  no  disease,  women  with  benign  lesions  and  pre- 
malignant  breast  diseases,  and  women  with  in-situ  and  invasive  carcinoma.  Of  the  77 
surgical  participants  who  completed  the  baseline  questionnaire,  69  donated  both  blood 
and  tissue  and  8  donated  only  blood. 

Task 4c: Surgeon to collect healthy tissue, benign lesions, atypia, in situ disease, and
invasive carcinoma tissue samples (complete). The Cedars Sinai team has
implemented  the  shared  tissue  collection  protocol  and  has  collected  tissue  samples  from 
196  study  participants.  Pathology  information  is  centrally  abstracted  at  FHCRC  using  a 
Patient  Level  Clinical  Diagnosis  form. 

Immediately  after  the  surgeon  has  removed  the  necessary  tissue  and  the  pathologist 
has  taken  what  is  required  for  pathologic  diagnosis,  a  study  Specimen  Collection 
Specialist  is  permitted  to  collect  specimens  from  the  removed  tissue  for  the  purposes  of 
the  CoE.  All  or  part  of  the  un-needed  tissue  is  collected,  labeled  and  processed  for 
storage.  The  tissue  is  embedded  in  OCT  and/or  snap  frozen.  Tissue  collected  includes 
malignant  tissue  with  adjacent  normal  tissue,  as  well  as  tissue  from  pre-malignant 
lesions and breast tissue from normal patients undergoing plastic surgery procedures at
Cedars-Sinai. 

The  Patient  Level  Clinical  Diagnosis  form  uses  information  that  has  been  abstracted 
from  pathology  and  other  medical  reports  to  characterize  a  woman  based  on  TNM 
staging  and  grade  of  disease  at  the  time  of  her  diagnosis.  A  FHCRC  study  staff  member 
completes  this  form  for  all  CoE  surgical  participants  with  the  research  nurse  conducting 
quality  assurance. 

Breast  Tissue  Histology  Review 

Working  closely  with  breast  pathologists  Drs  Sean  Thornton  and  Ellen  Pizer  of 
Washington  Pathology  Consultants,  we  have  created  two  forms:  Breast  Histology  Tissue 
Review  Form  and  a  Clinical  Status  Follow-Up  Form  that  are  used  to  characterize  breast 
tissue  samples  and  capture  treatment  and  disease  status  post-diagnosis  and  surgery.  In 
2006  a  review  of  a  pilot  group  of  60  tissues  from  10  participants  was  conducted  in  an 
effort  to  discover  whether  the  most  severe  diagnosis  listed  on  the  patient  pathology 
report  matched  the  actual  histology  of  the  tissue  specimens  collected.  The  results 
(reported  in  the  2007  annual  progress  report)  led  us  to  conclude  that  tissue  specimens 
must  be  independently  reviewed  to  accurately  determine  histology;  it  is  insufficient  to 
rely  solely  on  the  pathology  report  from  surgery.  At  this  time,  review  is  only  performed  on 
tissues  being  used  for  discovery. 


Task  5.  Blood  samples  from  Mammography,  Biopsy  and  Surgical  Cohorts  are 
collected,  processed  into  serum  and  plasma  cryovials,  and  logged  into  specimen 
tracking  system  (complete). 

In  all  blood  collections,  the  Specimen  Collection  Specialist  collects  up  to  50  ml  of  whole 
blood.  At  the  initial  collection  the  phlebotomist  will  distribute  the  blood  between  3  red  top 
(serum)  tubes,  1  purple  top  (EDTA  plasma)  tube,  and  one  yellow  top  (ACD-plasma  and 
lymphocytes)  tube.  For  all  subsequent  draws,  blood  is  collected  in  4  red  top  tubes  and  1 
purple  top  tube. 

Standard  protocols  are  followed  to  process  specimens  into  sera  and  plasma  and  aliquot 
them  into  cryovials  uniquely  labeled  with  study  specimen  ids.  Specimens  are  then 
logged  into  the  Specimen  Tracking  System  database  (STS). 

The  blood  specimens  are  stored  in  1  ml  quantities  to  avoid  damaging  freeze-thaw 
cycles.  Aliquoted  specimens  are  entered  into  the  specimen  tracking  system  then 
transported  to  the  study  repository  for  long-term  storage  and  will  eventually  be  delivered 
to  laboratory  investigators  for  future  analysis.  Blood  draw  date  and  time,  and  time  of 
processing  and  freezing  are  recorded  in  STS  as  well. 


Task  6.  Revise  existing  ovarian  cancer  database  to  accommodate  breast  tissue 
specimens  and  questionnaire 

Task 6a: Analyze current system and prepare preliminary assessment of revised
software design specifications (complete). FHCRC programmers have enhanced an
existing  specimen  tracking  system  (STS)  to  accommodate  specimens  and  breast 
specimen  data  being  collected  as  part  of  the  CoE.  We  currently  track  the  following 
specimen  data:  date  of  blood  and/or  tissue  donation,  specimen  processing,  amount  of 
specimen  collected,  types  of  specimen  storage,  and  storage  location  of  specimen  aliquot 
or  tissue  vial  or  block. 

Task  7.  Develop  an  implementation  test  utilizing  proposed  software  with  a  middle 
tier  and  internet  interface  for  the  Clinical  Data  Module  (complete). 

Infrastructure  in  place  includes:  web  server  hardware,  web  service  software,  access 
security,  data  entry  form  templates,  and  referential  integrity  between  database  objects. 
We have refined an Access database to track information that is collected on our Patient
Level  Clinical  Diagnosis  Form.  This  form  provides  appropriate  information  to 
characterize  a  woman  based  on  TNM  staging  and  grade  of  disease  at  the  time  of  her 
diagnosis.  It  also  captures  receptor  status  information,  such  as  estrogen  or 
progesterone  positivity/negativity,  which  will  be  used  to  select  specimens  for  the  different 
specimen  sets.  This  database  acts  as  our  “clinical  module”  and  is  linked  to  SIM,  our 
primary  data  management  system,  which  in  turn  is  linked  to  the  STS  and  SpecimenDB 
(see Task 8 below for more detail).

Web  based  screens  in  SIM  for  questionnaire  data  entry  and  patient  tracking  have  been 
developed.  Routines  for  data  validation  with  each  submission  of  data  to  the  server  have 
been  implemented.  Every  value  entered  is  checked  for  validity.  Any  outliers  are 
returned  to  the  data  entry  specialist  for  verification  before  the  data  are  committed  to  the 
database. In addition, attempts to re-enter data that have previously been collected are
preempted via referential integrity.
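
The validation flow described above might look roughly like the following sketch. The plausibility ranges, field names and status labels are hypothetical; the report does not give the actual SIM rules. The duplicate check here emulates, with a simple key lookup, what the report attributes to referential integrity in the database.

```python
# Hypothetical plausibility ranges; the real SIM rules are not given in the report.
PLAUSIBLE_RANGES = {"age_years": (18, 100), "height_cm": (120, 220)}


def validate_submission(record, existing_keys):
    """Return (status, flagged_fields) for one submitted questionnaire record.

    status is "duplicate" when the (participant, form, visit) key is already
    stored, "needs_verification" when any checked value is missing or outside
    its plausible range, and "ok" otherwise.
    """
    key = (record["participant_id"], record["form"], record["visit"])
    if key in existing_keys:
        return "duplicate", []
    flagged = []
    for field, (low, high) in PLAUSIBLE_RANGES.items():
        value = record.get(field)
        if value is None or not (low <= value <= high):
            flagged.append(field)
    if flagged:
        return "needs_verification", flagged
    return "ok", []
```

Records returned as "needs_verification" would go back to the data entry specialist before being committed; only "ok" records would be written to the database.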

Task  8.  Develop  breast  specimen  tracking  database  to  replicate  and  enhance  the 
current  system's  functionality  adjusting  per  information  gained  in  the 
implementation  test  (complete). 

In  2006  Staff  Scientist  Dr.  Michel  Schummer  developed  SpecimenDB,  a  FileMaker 
database  for  information  that  is  generated  from  our  specimens,  such  as  experimental 
and  specimen  processing  results.  SpecimenDB  also  serves  as  a  front-end  to  CoE 
databases  SIM,  STS  and  the  Access  database  tracking  our  Patient  Level  Clinical 
Diagnosis  Form.  The  interface  provides  a  unified  look  across  all  components  and  is  thus 
easy  to  navigate.  Each  field  can  be  searched  without  knowledge  of  the  underlying 
structure.  Summary  reports  can  be  generated  from  any  view  as  Excel  or  PDF 
documents.  SpecimenDB  is  client-  and  web-based,  the  latter  allowing  for  collaboration 
across  sites.  Although  the  back-end  consists  of  several  databases,  the  user  sees  just 
three  major  areas:  Specimens,  Patients  and  Results. 

The  Specimens  area  holds  data  about  the  processing  of  the  specimens,  such  as  RNA 
extraction  (Figure  3a.)  This  allows  for  technicians  to  enter  information  pertaining  to 
specimen  processing.  Having  this  information  in  a  central  location  will  prevent  us  from 
distributing  a  specimen  that  was  previously  known  to  yield  poor  RNA  or  protein.  The 
Specimens  area  also  has  a  view  that  lists  multiple  specimens  in  rows  which  allows  for 
intuitive  searches  and  the  generation  of  summary  reports. 

The  Patients  area  holds  patient-related  information  that  has  been  stripped  of  identifying 
information,  including  the  pathology  reports,  both  abstracted  and  a  scanned  copy. 

Similar  to  the  specimen  area,  it  is  possible  to  toggle  between  views  that  list  detailed 
information  about  a  single  or  multiple  patients.  In  list  view,  it  is  further  possible  to  toggle 
between  patients  and  their  specimens,  allowing  for  simultaneous  querying  of  patient  and 
specimen  information. 

The Results area is designed to contain experimental data obtained from the specimens
in  our  repository.  We  have  designed  a  database  module  (written  in  FoxPro)  that  keeps 
track  of  our  serum  and  plasma  marker  measurement  workflow,  including  the  results. 
Although  optimized  to  work  with  our  laboratory,  this  module  can  also  accommodate 
results  data  generated  in  other  laboratories.  We  are  currently  expanding  SpecimenDB 
capabilities  to  link  to  these  results.  This  will  allow  us  to  perform  queries  across  patients, 
specimens  and  results  simultaneously  in  an  extremely  user-friendly  manner.  The  new, 
integrated  view  will  look  very  similar  to  the  current  view. 

Task 9. Develop collaborative web site

Task 9a: Develop site to support real-time discussion and information sharing among
investigators (complete). Investigators and study staff continue to utilize the CPAS
website,  created  by  Dr.  Martin  McIntosh  and  his  Computational  Proteomics  Laboratory 
group  at  the  Fred  Hutchinson  Center.  CPAS  is  an  open-source  science  portal  offering 
web-based  bioinformatics  and  collaboration  tools  to  help  scientists  store,  analyze  and 
share data from high-throughput experiments and clinical trials. CPAS is available as
free,  installable  software,  with  source  code.  This  work  was  funded  by  NCI  subcontract 
23XS144A.  Study  investigators  and  staff  use  CPAS  to  support  real-time  communication 
and  information  sharing  among  FHCRC  staff,  CoE  investigators  and  their  respective 
staff.  A  username  and  password  are  required  to  access  information  on  this  site.  The 
content  on  CPAS  is  organized  hierarchically  into  projects  and  subfolders,  much  like  the 
file  directories  on  your  computer;  therefore,  users  find  it  easy  to  navigate  through  and 
use. 

In addition to CPAS, from July 2004 to July 2009 we maintained a second website to
function  as  a  study  reference  to  outside  researchers  and  the  general  public.  The  website 
consisted  of  four  main  sections:  a  Homepage,  Research  Overview,  Advocacy,  and 
Community  Events.  There  was  also  a  link  to  our  internal  CPAS  site  accessible  only  to 
project  investigators  and  staff. 

Task 9b: Develop extensions that will give investigators ability to query specimen
tracking system and download summary reports (complete). SpecimenDB (explained in
detail  under  Task  8)  tracks  both  specimen  and  patient  related  (clinical)  information.  Its 
unification  of  several  databases  allows  investigator-generated  queries.  For  example,  a 
user  can  select  patients  that  match  certain  clinical  criteria  and  click  on  the  “toggle 
specimens”  button.  Available  specimens  matching  the  criteria  will  be  shown.  The  user 
can  then  search  for  subsets  of  these  specimens,  such  as  available  serum  volume. 
Queries  can  be  performed  in  increments,  which  will  allow  the  investigator  to  review  the 
data  between  steps.  Multiple  AND  or  OR  statements  can  be  applied  without  knowledge 
of  the  underlying  database  structure.  Once  a  subset  of  records  has  been  identified,  a 
summary  report  can  be  generated  through  pre-configured  templates,  or  ad-hoc,  through 
user-selection.  To  facilitate  this  process,  field  names  are  the  same  in  the  user  interface 
as  in  the  underlying  database.  In  addition,  the  CoE  CPAS  site  is  linked  to  the  study’s 
data  management  system;  therefore,  investigators  are  able  to  access  and  view  data 
reports  as  if  they  were  in  the  SIM  system. 
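
The incremental query pattern described for SpecimenDB (select patients, toggle to their specimens, then narrow further) can be sketched in a few lines. This is an illustration only: SpecimenDB is a FileMaker application, and the record layout, field names and helper functions below are hypothetical stand-ins rather than its actual interface.

```python
patients = [
    {"patient_id": "P1", "diagnosis": "invasive", "er_status": "positive"},
    {"patient_id": "P2", "diagnosis": "benign", "er_status": None},
    {"patient_id": "P3", "diagnosis": "in situ", "er_status": "negative"},
]
specimens = [
    {"specimen_id": "S1", "patient_id": "P1", "type": "serum", "volume_ml": 3.0},
    {"specimen_id": "S2", "patient_id": "P1", "type": "plasma", "volume_ml": 1.0},
    {"specimen_id": "S3", "patient_id": "P2", "type": "serum", "volume_ml": 4.0},
    {"specimen_id": "S4", "patient_id": "P3", "type": "serum", "volume_ml": 1.0},
]


def select(records, *predicates, mode="and"):
    """Keep records matching all predicates (mode="and") or any (mode="or")."""
    combine = all if mode == "and" else any
    return [r for r in records if combine(p(r) for p in predicates)]


def toggle_specimens(selected_patients, all_specimens):
    """Switch from a patient selection to the specimens those patients donated."""
    ids = {p["patient_id"] for p in selected_patients}
    return [s for s in all_specimens if s["patient_id"] in ids]


# Step 1: patients with invasive OR in situ disease.
cases = select(
    patients,
    lambda p: p["diagnosis"] == "invasive",
    lambda p: p["diagnosis"] == "in situ",
    mode="or",
)
# Step 2: toggle to their specimens, then require serum AND at least 2 ml.
case_specimens = toggle_specimens(cases, specimens)
usable = select(
    case_specimens,
    lambda s: s["type"] == "serum",
    lambda s: s["volume_ml"] >= 2.0,
)
```

Because each step returns an ordinary list, the intermediate result can be reviewed before the next criterion is applied, which mirrors the stepwise review the report describes.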

Task 9c: Develop web pages for each investigator that are linked to collaborative site
(complete). We have developed folders on CPAS for each laboratory based
investigator.  Each  investigator  is  able  to  design  their  own  folder  and  create  subfolders 
suiting  their  specific  needs;  however,  we  request  that  investigators  use  their  folders  to 
upload  all  laboratory  results  and  to  view  marker  results.  We  have  also  developed  folders 
to  support  investigator  specific  meetings  and  collaborative  activities,  such  as  the 
quarterly  investigator  calls  and  the  developing  Specimen  Review  Committee.  In  addition, 
we  have  created  a  folder  that  is  open  to  the  public  to  support  the  upcoming  CoE 
investigator  meetings. 

Task  10:  Prepare  and  Analyze  BDS 

Task 10a: Provide samples from a set of 66 women to laboratory investigators
(complete). BDS specimens were provided to study laboratory investigators to
determine  the  preliminary  usefulness  of  new  markers.  Characteristics  of  the  participants 
whose  specimens  make  up  the  BDS  are  described  above  in  table  4.  The  assays  used  to 
measure  markers  in  the  BDS  are  described  in  Appendix  A. 

We  will  apply  for  future  funding  to  continue  working  with  our  collaborators  listed  in 
Appendix  B  to  identify  new  markers  and  test  their  efficacy  in  the  BDS.  New  funding  will 
also allow promising markers to continue through the pipeline and be evaluated in the
PDS  and  PVS.  In  addition  the  BDS  and  the  Breast  Panel  Validation  Sets  are  available  to 
new  collaborators  that  may  have  promising  markers  or  discovery  platforms  through  an 
RFA  mechanism. 

Tasks  10b  and  10c:  Statistical  analysis  of  preliminary  BDS  results  (complete).  Statistical 
analyses  on  preliminary  assay  results  from  the  BDS  were  completed  this  year  and  the 
results  are  summarized  on  page  7  in  Figures  1  and  2. 

Tasks  11-13:  Biomarker  Panel  Validation  and  Evaluation  (months  55-84) 

Task 11. Prepare and analyze Panel Development Set. Markers were measured in
both the PDS and the PVS simultaneously. Statistical analyses have been performed
only on the PDS results, and those results are reported below for Tasks 12 and 13.

Task  12.  Conduct  statistical  work  to  evaluate  candidate  biomarkers  (in  PDS) 

Task  12a.  Establish  cut  offs  for  normals  (complete).  The  following  markers  were 
evaluated:  COL1A1,  FN1,  HE4,  MIC1,  MMP7,  SPARC  and  TFF3.  Thresholds  were 
estimated  using  the  PDS  for  90%  and  95%  Specificity  levels. 
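
One simple way to obtain such cut-offs is the empirical quantile of marker values in controls. The sketch below assumes that higher marker values indicate disease and that a result is called positive when it exceeds the threshold; the report does not state the estimation method or the direction of each marker, so both are assumptions.

```python
import math


def specificity_threshold(control_values, specificity):
    """Smallest observed control value t such that at least `specificity`
    of controls satisfy value <= t (and so test negative)."""
    ordered = sorted(control_values)
    # Number of controls that must fall at or below the threshold.
    # The small tolerance guards against floating-point round-up.
    k = max(1, math.ceil(specificity * len(ordered) - 1e-9))
    return ordered[k - 1]


def observed_specificity(control_values, threshold):
    """Fraction of controls at or below the threshold."""
    return sum(v <= threshold for v in control_values) / len(control_values)
```

With 20 control values, the 90% cut-off is the 18th ordered value and the 95% cut-off is the 19th; in small samples the achieved specificity can only take steps of 1/n, so the target is met or slightly exceeded.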

Task 12b. Assess single marker sensitivity and specificity for candidate biomarkers
(complete). ROC analyses were conducted to evaluate sensitivity and specificity of
individual  markers  in  the  PDS. 
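
For readers unfamiliar with ROC analysis, a minimal sketch follows. It computes the empirical ROC points and the area under the curve for one marker, assuming higher values indicate cancer and that a value at or above the cut-off counts as positive. It illustrates the general technique only and is not the software used for the PDS analyses.

```python
def roc_auc(case_values, control_values):
    """Probability that a random case exceeds a random control (ties count 1/2).

    This Mann-Whitney form equals the area under the empirical ROC curve.
    """
    wins = 0.0
    for c in case_values:
        for n in control_values:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


def roc_points(case_values, control_values):
    """(false positive rate, sensitivity) pairs over all observed cut-offs,
    starting at (0, 0) and ending at (1, 1)."""
    cutoffs = sorted(set(case_values) | set(control_values), reverse=True)
    points = [(0.0, 0.0)]
    for t in cutoffs:
        sens = sum(v >= t for v in case_values) / len(case_values)
        fpr = sum(v >= t for v in control_values) / len(control_values)
        points.append((fpr, sens))
    return points
```

An AUC near 0.5 means the marker separates cases from controls no better than chance, which is the pattern that would lead to the conclusion that a marker lacks sufficient sensitivity or specificity.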

Task 12c. Examine stability of markers over time within and between subjects. Effects
of covariates on marker levels, as well as the within-woman and between-woman variance of
markers, were estimated. Generalized estimating equations (GEE) methods were used
to account for correlation of results from multiple blood collections from the same
woman at separate time points. Stability over time and effects of covariates on the
marker  levels  were  evaluated  using  GEE  and  also  by  graphical  analyses. 
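
The within-woman and between-woman variance idea can be illustrated with a simpler technique than the one used in the study. The sketch below is a one-way random-effects ANOVA decomposition, not a GEE model, and it ignores covariates; the input layout (marker values grouped by woman) is a hypothetical example.

```python
from statistics import mean


def variance_components(values_by_woman):
    """One-way random-effects ANOVA estimates from serial marker values.

    `values_by_woman` maps a woman's id to her list of marker values.
    Needs at least two women and at least one woman with repeated draws.
    Returns (within-woman variance, between-woman variance, intraclass
    correlation). The between-woman estimate is truncated at zero.
    """
    groups = [v for v in values_by_woman.values() if v]
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ms_within = ss_within / (n_total - k)
    ms_between = ss_between / (k - 1)
    # Effective number of draws per woman, allowing unequal group sizes.
    n0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (k - 1)
    between = max(0.0, (ms_between - ms_within) / n0)
    total = between + ms_within
    icc = between / total if total > 0 else 0.0
    return ms_within, between, icc
```

A high intraclass correlation indicates that a marker is stable within a woman relative to the spread between women, which is the property that makes a woman's own earlier values a useful baseline.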

Task 12d and 12e. Using augmented logistic regression, estimate optimal combinations
of markers in a longitudinal setting and use an ROC curve to evaluate the contribution of
markers to mammography. This task will not be completed because individual
markers did not have sufficient sensitivity or specificity in the PDS to warrant
investigating either the benefit of combining markers in a panel or their ability to
complement mammography.

Task 12f. Provide feedback to laboratory scientists via CPAS each step of the way.

While CPAS was useful in other areas of this study, we found that email and in-person
meetings were more effective tools for communicating feedback to our lab
scientists.

Task  13.  Prepare  and  analyze  Panel  Development  and  Validation  Sets  (PDS,  PVS) 

Task 13a. FHCRC Laboratory technician to conduct biomarker assays on blinded
samples from 500 women in the PDS (complete). Seven markers (listed above) were
measured  in  both  the  PDS  and  the  PVS  simultaneously. 

Task 13b. Blinded samples given to laboratory scientists to continue refinement of new
assays. Due to the heterogeneity of breast cancer subtypes, preliminary analyses were
conducted in the PDS, which comprised specimens from all cases meeting certain
data availability requirements regardless of subtype. Laboratory scientists remained
blinded to the PDS and PVS at all times, so this set was not useful for optimizing assay
conditions. To focus laboratory efforts on more homogeneous groups of specimens, we
are identifying subgroups of CoE specimens that may be relevant to specific biomarker
discovery experiments or assay optimization using other sources of funding.


Tasks 13c. and 13d. Data to Dr. McIntosh to validate the ability of the marker panel to
discriminate breast cancer from non-cancerous conditions and biomarker validation
team to evaluate the improvement in performance attributable to marker panel.

Individual  markers  did  not  have  sufficient  sensitivity  or  specificity  in  the  PDS  to  warrant 
investigating  the  benefit  of  combining  markers  in  a  panel. 

Task 13e. Prepare reports and manuscripts describing performance of marker panel. In
progress. Manuscripts are currently being drafted describing the PDS and PVS
specimen  sets,  methods  for  assays  used  in  analyzing  these  sets,  and  findings  from  the 
PDS.  We  have  one  completed  manuscript  (Appendix  D)  described  in  more  detail  below 
under  key  research  accomplishments. 


Key  Research  Accomplishments 

■  Assays  for  83  potential  biomarkers  were  purchased  or  developed  and  tested  in  the 
BDS  by  the  TOR  laboratory  and  collaborators. 

■  Assays  for  seven  biomarkers  were  tested  in  the  Panel  Development  and  Validation 
Sets  by  the  TOR  laboratory. 

■  Due  to  a  lack  of  candidate  markers  available  for  evaluation  we  have  devoted  project 
resources  to  discovery  in  CoE  tissue  samples.  We  conducted  a  comparison  of  gene 
expression  profiles  in  tissues  collected  from  surgical  cohort  cases  and  from  healthy 
controls  undergoing  reduction  mammaplasty.  Potential  molecular  targets  for 
differential  expression  were  identified  by  a)  mining  publicly  available  expression  data 
and  b)  utilizing  a  commercial  PCR  array.  46  genes  with  differential  expression 
between  cases  and  controls  were  identified.  In  cases,  7  of  38  normal  tissues 
removed  from  a  distant  site  in  the  diseased  breast  exhibited  a  cancer-like  expression 
profile.  The  remaining  31  tissues  were  genetically  similar  to  the  profiles  from  samples 
collected  from  mammaplasty  controls.  This  suggests  it  may  be  possible  to  identify 
regions  of  ipsilateral  histologically  “normal”  breast  tissue  that  are  predisposed  to 
malignancy.  These  areas  could  then  be  targets  for  localized  treatment  for  prevention. 
Most  importantly,  12  genes  were  discovered  with  under-expression  in  cancers  linked 
to  aggressive  disease  with  poor  outcomes.  These  genes  were  not  previously 
associated  with  breast  cancer  and  have  the  potential  to  become  markers  of 
prognosis.  These  results  are  described  in  a  new  manuscript  (Appendix  D)  submitted 
September  23,  2009  to  the  Journal  American  Association  of  Cancer  Research  titled, 
“The  Discovery  of  Novel  Human  Breast  Cancer  Markers  with  Potential  for  Prognosis 
in  Early  Detection.” 


Reportable  Outcomes 

October  2008-September  2009 

1)  Miller  CP,  Lowe  KA,  Valliant-Saunders  K,  Kaiser  JF,  Mattern  D,  Urban  N,  Henke  M, 
Blau  CA.  Evaluating  Erythropoietin-Associated  Tumor  Progression  Using  Archival 
Tissues  from  a  Phase  III  Clinical  Trial.  Stem  Cells.  2009  Sep;27(9):2353-61.  (Appendix 
D) 


Previously  Reported  Outcomes  (October  2002-September  2008) 

1)  Scholler  N,  Garvik  B,  Quarles  T,  Jiang  S,  Urban  N.  Method  for  generation  of  in 
vivo  biotinylated  recombinant  antibodies  by  yeast  mating.  J  Immunol  Methods. 
2006  Dec  20;317(1-2):  132-43. 

2) Urban ND, Longton GM, Crowe AD, Drucker MJ, Lehman CD, Peacock S, Lowe
KA,  Zeliadt  SB,  Gaul  MA.  Computer-Assisted  Mammography  Feedback  Program 
(CAMFP):  An  Electronic  Tool  for  Continuing  Medical  Education.  Academic 
Radiology.  2007  Sep;  14(9):  1036-42 

3)  Loch,  C.  M.,  Ramirez,  A.  B.,  Liu,  Y.,  Sather,  C.  L.,  Delrow,  J.J.,  Garvik,  B., 
Scholler,  N.,  Urban,  N.,  McIntosh,  M.  W.  and  Lampe,  P.  D.  Use  of  High  Density 
Antibody  Arrays  to  Validate  and  Discover  Cancer  Serum  Biomarkers.  Molecular 
Oncology.  December  2007.  Vol.  1,  Issue  3,  Pages  313-320. 

4)  Thorpe,  JD,  Duan  X,  Forrest  R,  Lowe  K,  Brown  L,  Segal  E,  Nelson  B,  Anderson 

5) Use of cancer-specific yeast-secreted in vivo biotinylated recombinant antibodies
for serum biomarker discovery; Scholler, N, Gross, JA, Garvik, B, Wells, L, Liu,
Y, Loch, CM, Ramirez, AB, McIntosh, MW, Lampe, PD, Urban, N. Journal of
Translational Medicine 2008, 6:41.


6)  Use  of  yeast-secreted  in  vivo  biotinylated  recombinant  antibodies  (biobodies)  in 
bead-based  ELISA;  Scholler  N,  Lowe  K,  Bergan  L,  Kampani  A,  Ng  V,  Forrest  R, 
Thorpe  J,  Gross  J,  Garvik  B,  Drapkin  R,  Urban  N.  Clinical  Cancer  Research,  14 

Conclusions

We  have  quite  exhaustively  searched  without  success  for  an  early  detection  serum 
marker  for  breast  cancer.  We  have  learned  that  markers  that  initially  appear  promising 
are  shown  in  a  careful  validation  study  to  be  influenced  more  by  confounding  conditions 
than by malignancy. Five of the 7 best candidate markers are influenced by the age of the
woman (MIC1, COL1A1, HE4, FN1 and MMP7). Similarly, 4 of 7 markers are affected
by current OC use (TFF3, FN1, HE4 and MIC1). One of the markers (SPARC) is
affected  by  current  HRT  use  as  well  as  conditions  of  the  blood  draw  (surgical  vs.  clinic 
visit).  We  conclude  that  future  discovery  efforts  must  account  for  these  confounding 
factors  to  avoid  identification  of  markers  for  hormone  use  or  age  rather  than  malignancy. 
The  well-annotated  specimens  that  we  have  collected  will  be  useful  for  such  discovery 
efforts  as  well  as  for  further  validation  efforts.  Half  of  our  validation  set  remains  blinded 
so  that  it  can  be  used  when  markers  are  eventually  identified  that  are  worthy  of  inclusion 
in  a  panel. 

** Screened by SP (Hanash). Protocol adapted from: Bignotti E, et al. Trefoil factor 3: a novel serum marker identified by gene expression profiling in
high-grade endometrial carcinomas. Br J Cancer. 2008 September 2; 99(5): 768-773.

*** The plate-based assay was compared to the bead-based assay in the OTS. The two assays correlated
very well. Therefore, the bead-based assay was used for the BPVS.

ABSTRACT


Mammography reduces breast cancer mortality in women over 50, but its performance
is often criticized. Finding a marker (or markers) to complement mammography,
thus improving its sensitivity and specificity, would have great clinical value. Unfortunately,
no serum marker has proven reliable in detecting breast cancer. To find such a
marker, we utilized two sets of well-characterized tissues: one from breast cancer patients
and the other from healthy women undergoing reduction mammaplasty. We identified
over 46 differentially expressed genes from a large list of potential targets by a)
mining publicly available expression data (identifying 134 genes for quantitative PCR)
and b) utilizing a commercial PCR array. These genes warrant further investigation as
potential blood markers for early detection. As a second finding, when histologically
normal breast tissue was removed from a distant site in a breast with cancer, specimens
from 7 of 38 patients displayed a cancer-like expression profile, while the remaining 31
were genetically similar to the reduction mammaplasty control group. This suggests that
it may be possible to identify regions of ipsilateral histologically ‘normal’ breast tissue
that are predisposed to becoming malignant. This might lead to future clinical methodologies
to identify normal-appearing tissue that warrants localized treatment for prevention.
Most importantly, 12 genes showed lower expression in cancers with a poor outcome,
suggesting their use as prognostic markers; these genes were also under-expressed
in a large number of controls.


Unpublished  Data 




INTRODUCTION 


While mammography reduces breast cancer mortality in women over the age of 50 (1, 2),
there is controversy regarding the degree of benefit (3). Most critics agree that
mammography has less value in women under 50 (4) due to lower sensitivity and a high rate
of false positives (5). Existing serum markers (CA 15-3, CEA and CA 27-29) have low
sensitivity and low specificity. Although they may be useful for monitoring treatment in
patients with advanced disease (6), these markers are not helpful in the early detection
of breast cancer. There is thus a pressing need for novel markers that can be used
independently or that can complement mammography in early detection.

In 1999 we successfully used a transcript-based discovery approach to identify early
detection markers for ovarian cancer (7). Following the 5 phases of screening biomarker
development proposed by Pepe et al. (8), HE4, the product of the human epididymis gene
WFDC2, was developed into a serum assay (9, 10) that is now approved for remission
monitoring of ovarian cancer. It is being evaluated for its potential role in screening
and is considered a successful product of translational biomarker research.

Because breast cancer marker research has focused mainly on prognosis (6), there are few
comprehensive studies to identify early detection markers. We therefore relied on our
previously successful approach using gene discovery by cDNA microarray followed by
expression validation through polymerase chain reaction (PCR), ranking of potential
markers and the development and testing of serum assays. For breast cancer, a large body
of research already available in the public domain allowed us to forego our own
microarray work; instead we mined publicly available expression data in tissues of breast
cancer and normal healthy controls.

One of the lessons learned from previous gene discovery experiments is the importance of
having high-quality, appropriately preserved specimens and matching patient data. We
therefore spent considerable effort on the accrual of needed tissues. In close
collaboration with participating surgeons and pathologists, we were able to collect
specimens in the operating and gross rooms where they were processed with as little delay
as possible, thus minimizing variability. In addition, routine clinical gross and
microscopic tissue analysis was complemented with routine research histological
examination on the actual tissue piece that was later used for expression analysis.
Breast tissues from breast cancer patients were then compared to those from healthy
individuals. While normal tissue adjacent to the cancer is relatively easy to obtain, we
were concerned that in cancer patients, cancer-related pathways may be perturbed in these
tissues (11). We therefore used normal tissue from breast reduction mammaplasties as
controls.

The identified genes have the potential to become markers for molecular pathology (e.g.
aiding the pathologist in deciding on the malignant potential of a suspicious-looking
tissue section), for prognosis, for guiding therapy and, as proteins in serum, for early
diagnosis in a panel, potentially complementing mammography.

METHODS


Patients and tissues. Patients were enrolled at Swedish Medical Center, Seattle and
Cedars Sinai Medical Center, Los Angeles. Patients gave consent before surgery and were
administered a health status and family history questionnaire. Hospital records were used
for follow-up. Cancer, ipsi- and contralateral normal tissues were obtained at Swedish
Medical Center from 44 patients (including 7 with neoadjuvant treatment) undergoing
mastectomy. Tissues from breast reduction surgeries (20 patients) were obtained from
private practices in Los Angeles and histologically analyzed to exclude any with
abnormalities. In all cases, tissue was obtained and processed by research personnel in
the operating or gross room and frozen within 1 hour of surgery. The frozen tissue
available for research (mean: 150 g, range: 20-500 g) was split into several pieces of
which one was fixed in formalin, embedded in paraffin and used for histological
examination by a pathologist. The other pieces were kept frozen and used for RNA
extraction. Only tumor samples with more than 70% tumor cells, excluding in situ disease,
and normal samples with less than 60% fat were included. In the end, gross and
microscopic clinical evaluation matched the histology of the actual tissue piece being
analyzed in 50% of the cancer tissues and 67% of tissues with normal histology. Patient
characteristics are reported in the supplemental Table S1.

LevelsDB. Over the last 10 years a database has been compiled (LevelsDB) that holds gene
and protein expression information from over 134 publications (90% transcript-, 10%
protein-based) and 21,890 genes. LevelsDB was created to facilitate the discovery of
markers for cancer detection, and emphasis was given to publications with data for normal
controls as well as cancers. LevelsDB uses the GeneID as an identifier (12), which is
related to the gene symbols governed by the Guidelines for Human Gene Nomenclature (13).
The datasets were extremely variable in the way they recorded expression, ranging from a
simple list of proteins to raw cDNA microarray expression data. As a consequence,
LevelsDB forewent exact representation of original expression values. Instead, it
recorded whether or not a transcript or protein was present in a given tissue, whether it
had tumor-to-normal ratios above a factor of 2 or, in the case of cDNA microarray-based
expression data, whether an mRNA was expressed at low, medium or high levels (thresholds
defined by 1x, 3x, and 10x the median expression across all tissues). LevelsDB also
contains data on subcellular localization. The datasets used in LevelsDB are listed in
the Supplement.
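
The three-level coding described above can be sketched as a small function. This is an illustration only, not the LevelsDB implementation: the function name, the handling of values exactly at a threshold, and the label for values below 1x the median are assumptions that the text does not specify.

```python
def expression_level(value, median_across_tissues):
    """Bin an expression value by its ratio to the median across all tissues.

    Thresholds (1x, 3x, 10x the median) follow the text; treating them as
    inclusive lower bounds is an assumption.
    """
    ratio = value / median_across_tissues
    if ratio >= 10:
        return "high"
    if ratio >= 3:
        return "medium"
    if ratio >= 1:
        return "low"
    # Assumption: the text does not say how values below 1x the median are coded.
    return "below median"


# Hypothetical values, for illustration only.
levels = [expression_level(v, 2.0) for v in (1.0, 2.0, 7.0, 25.0)]
```

Storing a coarse level rather than the raw value is what lets datasets with very different measurement scales sit side by side in one table.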

RNA extraction and real-time PCR (SYBR). Snap-frozen tissues were homogenized with a
TissueLyser (Qiagen, Valencia, CA) in Trizol (Invitrogen, Carlsbad, CA). Total RNA was
then extracted using RNeasy with DNAse I (Qiagen). RNA quality was assessed with an
Agilent 2100 Bioanalyzer (Agilent, Santa Clara, CA; 28S/18S RNA ratio of 1 ± 0.2) and by
spectrophotometer (OD260/OD280 ratio > 1.6). Mean RNA yield was 90 ± 130 ng per mg of
tissue. Copy DNA was reverse transcribed from 5 µg of total RNA (Superscript III kit,
Invitrogen) with oligo-dT priming, of which 50 ng were used as template in a 15 µl PCR.
Copy DNA was amplified using the SYBR green kit (Invitrogen) on a 7900HT Fast Real-Time
PCR System (Applied Biosystems, Foster City, CA). Each 384-well plate contained aliquots
of all cDNAs used during the experiment as well as a standard made with testes cDNA in 5
dilutions (1:1, 1:3, 1:9, 1:27, 1:81) as duplicates, amplified with primers for ACTB (all
cDNAs were sub-aliquoted and stored at -20 °C for consistency). This allowed us to
transform the logarithmic cycle threshold (CT) values into linear values. Reactions were
performed in duplicate or, if samples did not amplify well or if the correlation between
the runs was poor, in triplicate. All PCRs were normalized by the averaged expression of
three housekeeping genes (ACTB, B2M and TMED10) run in triplicate. Primer sequences are
listed in the Supplement.
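
One common way to turn CT values into linear values with a dilution series is to fit a straight line of CT against the logarithm of the relative template amount and then invert it. The sketch below assumes that approach; the source does not describe the fitting procedure, and the function names and CT values are hypothetical.

```python
import math


def fit_standard_curve(relative_quantities, ct_values):
    """Least-squares fit of CT = intercept + slope * log10(relative quantity)."""
    xs = [math.log10(q) for q in relative_quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def ct_to_linear(ct, slope, intercept):
    """Invert the standard curve: relative quantity for a measured CT."""
    return 10 ** ((ct - intercept) / slope)


# The five dilutions named in the text, as relative template amounts.
dilutions = [1, 1 / 3, 1 / 9, 1 / 27, 1 / 81]
# Hypothetical CT readings: two cycles later for each threefold dilution.
example_cts = [20.0, 22.0, 24.0, 26.0, 28.0]

slope, intercept = fit_standard_curve(dilutions, example_cts)
relative_amount = ct_to_linear(22.0, slope, intercept)
```

Because the standard is run on every plate, the fitted line also absorbs plate-to-plate differences in amplification efficiency.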

OpenArray transcript expression. Two micrograms of total RNA were reverse transcribed
using the High Capacity cDNA RT Kit (Applied Biosystems) with random hexamer primers. All
cDNA was analyzed on the Cancer Pathways OpenArray system (BioTrove, Woburn, MA) using
the Fast Start DNA SYBR Green kit (Roche, Nutley, NJ). Four cDNA samples were tested
simultaneously per plate, with 16 samples per run. CT values were transformed into linear
values by calculating 1.735 ^ (32 - CT). Values were normalized by the mean of 18
housekeeping genes. The expression values from the OpenArray platform and the qPCR (SYBR)
were mean-normalized to allow for comparison across the two platforms.
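
The OpenArray transformation and housekeeping normalization can be written out as follows. The formula 1.735 ^ (32 - CT) is taken from the text; the function names, the placeholder gene labels and the CT values are assumptions made for illustration, and only two housekeeping genes are used here instead of the 18 in the study.

```python
def openarray_linear(ct, base=1.735, ct_ceiling=32):
    """Linear expression value from a cycle threshold: base ** (ct_ceiling - ct)."""
    return base ** (ct_ceiling - ct)


def normalize_to_housekeeping(linear_values, housekeeping_genes):
    """Divide every gene's linear value by the mean of the housekeeping genes."""
    hk_mean = sum(linear_values[g] for g in housekeeping_genes) / len(housekeeping_genes)
    return {gene: value / hk_mean for gene, value in linear_values.items()}


# Hypothetical CT values for one sample; HK1 and HK2 are placeholder labels.
sample_ct = {"HK1": 22.0, "HK2": 24.0, "GENE_X": 27.5}
sample_linear = {gene: openarray_linear(ct) for gene, ct in sample_ct.items()}
sample_normalized = normalize_to_housekeeping(sample_linear, ["HK1", "HK2"])
```

A CT of 32 maps to a linear value of 1, so later-amplifying (less abundant) transcripts fall below 1 and earlier ones rise by a factor of 1.735 per cycle.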

Cluster analysis. Unsupervised hierarchical clustering was performed using Spearman rank
correlation as similarity metric and centroid linkage as clustering method. PCR
expression values were averaged between duplicate runs, mean-normalized and entered into
the Cluster program (14) as log2 values. The tree was visualized using Java TreeView
(15).
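
The similarity metric named above, Spearman rank correlation, is the Pearson correlation of the ranks. The sketch below computes it in plain Python for two expression profiles; it covers only the similarity step and is not the Cluster program's code. The centroid-linkage tree building is left out, and the profile values are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical log2 expression profiles of two tissues over five genes.
tissue_a = [0.2, 1.5, -0.7, 3.1, 0.9]
tissue_b = [0.1, 2.0, -1.2, 2.8, 1.1]
similarity = spearman(tissue_a, tissue_b)
```

Because only ranks enter the calculation, the metric is insensitive to monotone differences in scale between tissues or platforms.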

RESULTS AND DISCUSSION


Determination of components of variation of normal tissue. The primary goal was to
identify genes that discriminate between normal breast tissue and invasive carcinoma of
the breast. Since normal breast tissue was to be used as a reference, a threshold needed
to be determined above which a gene would be labeled as differentially expressed.
Therefore we assessed the variability in gene expression within normal breast tissue; to
our knowledge, this has not previously been well studied. Quantitative PCR (SYBR)
expression analysis was performed for 18 genes on an average of 3.5 tissue slices per
breast from 10 women with bilateral reduction mammaplasty. The PCRs were normalized by
their median and the duplicate runs were averaged. Table 1 shows overall gene expression
variability (as standard deviation) and which fraction of it is attributable to the
component variabilities of woman-to-woman (averaging 64% ±9%), left-to-right breast
(averaging 6% ±3%) and within-breast (averaging 30% ±9%). These percentages represent the
overall magnitude of the different sources of variation as determined by ANOVA. As
expected, between-woman variability is greatest, twice that of within-breast variability,
implying that the largest source of variation is heterogeneity among women. The smallest
source of variation is the between-breast component, implying that normal material from a
contralateral breast is a good surrogate for normal material from the affected breast.
Approximately 30% of overall variation was explained by variation in the molecular
behavior of different biopsies of the same breast (note that the assays were performed in
duplicate, and the coefficient of variation (CV) of the assay was small compared to each
of these components of variation and so can be ignored). Because each value was
median-centered, the standard deviation also represents a surrogate for CV in the
population; it is scaled to the typical expression levels. Six of the 18 genes (COL1A2,
CTGF, GATA3, LYZ, MUC1 and WFDC2) have standard deviations above 1, which could be
related to a spotty expression pattern (e.g. only a few cells in a tissue express the
transcript). Of note, this elevated variability is unrelated to within-breast
variability. For WFDC2, the most variable gene, the standard deviation is 2.73.
Therefore, for subsequent cancer-to-normal ratios a threshold of the mean plus 3 standard
deviations was chosen for all genes.
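
One way to read the three fractions is against a nested random-effects model. The notation below is an illustrative assumption: the text states only that the percentages were determined by ANOVA and does not specify the model, or whether the fractions refer to variance or to standard deviation.

```latex
% Assumed model: slice s of breast k of woman w, for one gene
y_{wks} = \mu + a_{w} + b_{k(w)} + e_{s(wk)},
\qquad
a_{w} \sim (0, \sigma^{2}_{\mathrm{woman}}),\;
b_{k(w)} \sim (0, \sigma^{2}_{\mathrm{breast}}),\;
e_{s(wk)} \sim (0, \sigma^{2}_{\mathrm{within}})

% Total variance and the fraction attributed to each component
\sigma^{2}_{\mathrm{total}}
  = \sigma^{2}_{\mathrm{woman}} + \sigma^{2}_{\mathrm{breast}} + \sigma^{2}_{\mathrm{within}},
\qquad
f_{c} = \frac{\sigma^{2}_{c}}{\sigma^{2}_{\mathrm{total}}},
\quad c \in \{\mathrm{woman}, \mathrm{breast}, \mathrm{within}\}
```

Under this reading the three fractions sum to 1 for each gene, which is consistent with the rows of Table 1 up to rounding.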

Database mining identifies genes with differential expression between breast cancer and
normal tissues. Despite a large body of research on gene and protein expression in breast
cancer, few studies include healthy controls. In those that do (16-25), the normal
tissues are often not well characterized. Publications reporting expression only in
breast cancer and not in healthy controls are still useful for a cancer-to-normal
comparison since the data can potentially be matched with those from other sources using
healthy normal tissues. These and additional expression data were compiled in a database
(LevelsDB) that was then mined starting with genes contained in breast cancer data sets
(4405 genes), followed by removal of genes and proteins whose subcellular localization
makes the protein unlikely to be found in the blood stream by non-necrotic processes
(nuclear, mitochondrial and ribosomal), leaving 3271. In a next step, housekeeping genes
were excluded, followed by removal of genes with high levels of expression in normal
tissues of organs with strong blood contact (kidney, lungs, liver, heart and pancreas).
The remaining genes and proteins were filtered for those with low expression in normal
tissues. The lack of truly normal breast expression data, except for data from one normal
tissue coming from a breast with cancer (26), required the use of other normal,
especially epithelial, tissues for subtractive comparison, contained in six datasets in
LevelsDB. The last reduction step resulted in 150 genes and proteins, of which 44 were
likely to be secreted or membrane-bound, which makes them ideal candidates for blood
markers. These were augmented by 90 genes for which literature review suggests a
potential role as breast cancer markers, resulting in 134 genes (Table 2) for subsequent
expression analysis by PCR. References for these additional genes are listed in the
Supplement.

Expression validation results in 46 differentially expressed genes. To identify from the
134 genes those with the ability to discriminate between normal and malignant breast
tissue, PCR expression analysis was performed on 93 tissues (24 invasive cancers, 38
ipsilateral normals, 3 contralateral normals, 28 tissues from breast reduction surgery)
from 64 women (supplemental Table S1). The cDNA was oligo-dT primed. A comma-delimited
file with the expression data is available in the Supplement. PCR results for 8 genes
were not conclusive even after 2 repeats and the genes were removed from further analysis
(supplemental Table S2). Of the 126 remaining genes, 67 discriminated between the 24
cancer tissues and the 28 mammaplasty controls with >20% of the cancers and <5% of the
controls above or below threshold (Table 2, and supplemental Table S2).
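
The selection rule can be sketched in a few lines. The thresholds follow the Table 2 legend (control mean plus 3 standard deviations for over-expression, below the control minimum for under-expression) and the >20% / <5% fractions come from the text. The function name, the use of the sample standard deviation and all data values are assumptions for illustration.

```python
from statistics import mean, stdev


def classify_gene(cancer_values, control_values,
                  min_cancer_fraction=0.20, max_control_fraction=0.05):
    """Return 'over', 'under' or 'none' for one gene."""
    # Assumption: sample standard deviation; the source does not say which form.
    upper = mean(control_values) + 3 * stdev(control_values)
    lower = min(control_values)

    def fraction(values, predicate):
        return sum(1 for v in values if predicate(v)) / len(values)

    cancers_above = fraction(cancer_values, lambda v: v > upper)
    controls_above = fraction(control_values, lambda v: v > upper)
    cancers_below = fraction(cancer_values, lambda v: v < lower)
    # No control can lie below the control minimum, so this fraction is always 0.
    controls_below = fraction(control_values, lambda v: v < lower)

    if cancers_above > min_cancer_fraction and controls_above < max_control_fraction:
        return "over"
    if cancers_below > min_cancer_fraction and controls_below < max_control_fraction:
        return "under"
    return "none"


# Hypothetical normalized expression values for one gene.
controls = [1.0, 1.1, 0.9, 1.0, 1.05]
result_over = classify_gene([5.0, 1.0, 4.0, 1.0], controls)
result_under = classify_gene([0.1, 0.2, 1.0, 1.0], controls)
result_none = classify_gene([1.0, 1.0, 1.0, 1.0], controls)
```

Anchoring both thresholds on the controls means a gene is only called differentially expressed when a sizeable share of cancers falls outside the range seen in healthy tissue.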

After completion of the PCR work, a new PCR-based technology (OpenArray by BioTrove,
Woburn, MA) came to the market that allowed for more rapid gene expression analysis (27).
This technology was used to confirm the expression of a subset of the genes on a subset
of the tissues. BioTrove’s Cancer Pathways OpenArray plate included primer pairs specific
for 606 genes associated with DNA repair, angiogenesis, cell adhesion, apoptosis and cell
cycle, as well as many genes encoding kinases. Of the 134 genes from database mining, 41
overlapped with the OpenArray panel, which was therefore suitable for confirmation of
about 30% of the original results. Out of the 93 tissues that were originally tested, 13
were randomly selected from the 24 cancer tissues and likewise 9 from the 28 mammaplasty
controls. Applying the same criteria as for the original set, the OpenArray analysis of
the reduced set selected the same differentially expressed genes as the analysis of the
original set (supplemental Table S3). Both amplifications had good correlation
(supplemental Table S4) with an averaged coefficient of variation of 38% (12%-71%), even
considering the differences between both methods (oligo-dT versus random priming;
differences in primer sequences).

Unsupervised cluster analysis of the PCR data shows that 46 of the 67 genes have the
power to separate cancer tissues from controls, thus confirming the validity of our
approach (Figure 1, red and green bars on the left). These genes warrant further
investigation as potential blood markers for early detection.

The mammaplasty control patients, seen in Los Angeles, were on average 15 years younger
than the patients with tumor or ipsi- and contralateral normal tissue, seen in Seattle.
While differences in institution and age could potentially introduce bias, the
interspersing of the normal tissues in the control cluster suggests that this is not the
case (Figure 1). This can be attributed in part to the strict adherence to identical
specimen collection protocols at both sites. No clustering behavior was found based on
other factors listed in the supplemental Table S1, including mammography (BI-RADS) score
and breast density at the last mammogram before diagnosis, lymph node positivity, tumor
size, number of foci, stage and grade.

Comparison to previously published results. Comparing the present gene expression results
to those previously published is difficult because prior studies rarely used healthy
normal controls. While the terms “over-” and “under-expression” are common in the breast
cancer literature, they most often refer to expression of one cancer state relative to
another or to a cell line, and not relative to a healthy normal control. The three breast
cancer publications that used mammaplasty tissue as controls confirm the expression
pattern of the metalloproteinases (28), YWHAZ (11), ERBB3 and ERBB4 (29). Furthermore,
COL1A1 and COL1A2 over-expression was also seen in a meta-analysis of 13 publications
comparing breast cancer to largely undefined and probably ipsilateral normal tissue (30).
This gives credence to the observed expression pattern of the remaining genes.

Identification of additional differentially expressed genes. The OpenArray Cancer
Pathways chip contained 606 genes of which 41 were used for confirmation of the PCR
results. Unsupervised cluster analysis of the remaining 565 genes in the 13 cancers and 9
control tissues resulted in a clear distinction between over- and under-expressed genes.
The OpenArray PCR was not duplicated and thus the results have greater error margins.
Therefore more stringent filtering conditions were applied than for the original PCR:
genes with differential expression in more than 30% of the tumors at a tumor-to-normal
ratio of 1.2 were removed from the dataset. Of the resulting 102 genes, 88% were found to
be under-expressed in the tumor tissues (see supplemental Figure S1), as indicated by the
negative %CV numbers in Table 3. Once confirmed in their expression, it is likely that
some of these 102 genes will be added to the 46 potential markers.

The high number of under-expressed genes contrasts sharply with the results above, where
over- and under-expressed genes are equally represented. The Cancer Pathways genes had
been selected based on general cancer literature, which includes a large number of
within-cancer and cell line experiments. Our data mining, on the other hand, focused on
breast cancer, normal-to-cancer differences and extracellular expression. Hence, the
former contains a larger number of intracellular, regulatory proteins. Interestingly, the
differentially expressed genes in both sets are enriched for connective tissue genes,
suggesting that alteration in the composition of the connective tissue is an important
factor in cancer formation.

Histologically normal tissues from an affected breast can demonstrate molecular
predisposition to cancer. Unsupervised cluster analysis of the original PCR data placed 7
of the 38 ipsilateral normal tissues in the cancer cluster (Figure 1). The difference
between these and the remaining 31 ipsilateral tissues cannot be correlated with tissue
or patient characteristics (supplemental Table S1). Because BRCA status was not recorded,
the study does not address any possible link between mutation and a cancer-like gene
expression pattern in ipsilateral normal tissue. Another explanation is a positional
effect related to the distance between the lesion and the site of normal tissue
collection, or that the index lesion and the ipsilateral normal tissue come from the same
lobe (31). Consequently, a normal tissue from an unaffected contralateral breast should
display a normal-like gene expression pattern. Indeed, of the three contralateral normal
tissues, the two coming from a breast without evidence of cancer were found in the normal
cluster and the third, from a breast with malignancy, grouped with the cancers.

Tripathi et al. compared normal tissue from mammaplasty to ipsilateral normal and breast
tissue with in situ disease. They found that global gene expression abnormalities exist
in both normal epithelium of breast cancer patients and early cancers (11). The results
presented here go one step further by including same-patient invasive tissues.

This leads to the conclusion that ipsilateral normal tissues with cancer-like gene
expression are molecularly predisposed to cancer. To validate these findings, BRCA status
of the patient and positional information of the tissue pieces harvested from a breast
would need to be recorded. Our tissue collection protocol has been altered accordingly.

Identification of novel prognostic markers. Unsupervised cluster analysis of the original
PCR data resulted in a cancer cluster with two sub-clusters, one enriched for patients
with cancers of the luminal subtype and one of the basal subtype, as defined by hormone
receptor and HER2 expression (32) (Figure 1). The composition of these two sub-clusters
is summarized in supplemental Table S5. The luminal-like sub-cluster is defined by
over-expression of the luminal markers ESR1, PGR and GATA3 (33-35) in all of its tissues
and by the over-expression of CTGF, MMP2, AR, CFB, CD44, EPOR, CDKN1B, ETAA1, FGFR2,
TNFRSF10B, ERBB4, SCUBE2, FOXA1 and MUC1 in tissues from cancers with lobular or mixed
ductal-lobular histology. Except for AR (36) and ESR1 (37), none of these genes can be
linked to lobular cancer histology, in particular in the comparison by Zhao et al. of
ductal and lobular carcinomas (38). The role of GATA3 in the maintenance of the luminal
phenotype has been reviewed by Tlsty, particularly the correlation of low expression of
GATA3 and low estrogen receptor alpha (39), which the present data confirm. The
basal-like sub-cluster, characterized by the under-expression of these genes, is enriched
for triple-negative (hormone receptor and HER2-negative) cancers and contains all cancer
tissues of patients that are deceased (black dots) or have recurred (orange dots).
Lobular breast carcinomas are known to be associated with better survival than ductal
carcinomas (37, 40) and triple-negative breast cancers have been associated with poor
prognosis (41). Also, in a meta-analysis of published breast cancer cDNA data, low GATA3
expression is linked with poor clinical outcome (42). The difference between these cancer
sub-clusters could therefore be attributed to the aggressiveness of the disease. While
some of these genes have been linked to the basal subtype (16, 19) and some are now being
used to predict disease outcome, including SCUBE2 in Oncotype DX (43) and Mammaprint
(44), the majority of them may constitute a novel group of genes that predict outcome
and/or inform treatment.

Interestingly, the under-expressed genes that define the basal-like cluster are
under-expressed in most controls, suggesting that these aggressive cancers may be
difficult to distinguish from normal breast tissue at the molecular level.

Conclusions. Our results suggest that many of the genes commonly attributed to cancer
pathways are expressed at lower levels in breast cancers than in normal breast tissue,
confirming and further extending results by Tripathi et al. (11). Furthermore, the genes
that predict aggressive phenotype in between-cancer comparisons are not differentially
expressed between aggressive cancers and healthy controls. If serum assays commonly
measure the increase of a marker rather than its absence in cancer, our findings would
help explain the current lack of suitable blood markers for breast cancer, particularly
in patients with poor prognosis malignancies.

In spite of these shortcomings, our work resulted in the identification of a number of
differentially expressed genes, including 12 related to aggressive disease and a minimum
of 46 that discriminate between cancer and controls, some of which (MMP12, S100A7 and
SPP1) are over-expressed in aggressive cancer. Those coding for proteins that are readily
shed may be of greatest interest for serum marker evaluation, and markers that are
over-expressed but not shed may be the most attractive for tumor-specific localization,
including prognosis.

TABLES


Table 1. Variability in gene expression

Marker    StDev   Between   Between breasts   Within
                  women     within woman      breast
ASPN      0.88    75%       3%                21%
CAV1      0.89    60%       2%                38%
CFB       0.74    46%       8%                46%
COL1A2    1.13    76%       10%               15%
CTGF      1.07    67%       6%                28%
GATA3     1.61    60%       4%                36%
LETMD1    0.46    79%       9%                11%
MGST1     0.94    59%       4%                37%
LYZ       1.08    66%       8%                23%
MMP2      0.63    50%       10%               39%
MUC1      2.34    62%       4%                34%
SPARC     0.71    74%       3%                22%
SUMF2     0.39    55%       7%                35%
TIMP1     0.69    65%       9%                26%
TIMP2     0.49    63%       7%                29%
TIMP3     0.60    67%       3%                30%
WFDC2     2.73    61%       4%                35%
YWHAZ     0.37    64%       5%                30%
Mean              64%       6%                30%
StDev             9%        3%                9%

Variability in gene expression across all tissue pieces (StDev) and as percentage by each
of the three fractions. The last two rows display the mean and standard deviation of the
18 genes.


Table  2.  71  of  126  genes  discriminate  cancers  from  controls 


Gene  Symbol 

Result 

Gene  Symbol 

Result 

Gene  Symbol 

Result 

ADAM  12* 

EPOR 

V 

PEBP4 

AGR2* 

ERBB3* 

PGR 

V 

AKT1 

ERBB4 

V 

PIK3CA 

AM  BP* 

ESR1 

A 

PIP 

ANGPT2* 

V 

ETAA1* 

V 

PLAUR 

A 

APOL1* 

V 

FGFR2* 

V 

PRLR* 

AR 

V 

FN1* 

A 

PROCR 

V 

ASPN* 

A 

FOXA1 

A 

PSMA5* 
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Table  2.  71  of  126  genes  discriminate  cancers  from  controis 


Gene  Symbol 

Result 

Gene  Symbol 

Result 

Gene  Symbol 

Result 

BGN* 

A 

GATA3 

A 

PTPN1* 

BIRC5 

A 

GDF15 

PVRL4 

BRCA2 

HOXB7 

V 

S100A7 

A 

BRMS1 

IFIT1* 

S100B* 

V 

BUB1 

A 

IGF2 

SCGB2A1 

Cl  Sorts* 

KRT20 

SCUBE2* 

V 

CALB2 

KRT7 

SDC1 

CAV1 

V 

LCN2 

SFRP1 

V 

CCNE1 

A 

LETMD1 

V 

SFRP2* 

CD274 

A 

LPAR3 

SNIP 

CD44 

V 

LRRC15* 

A 

SPARC 

V 

CDH1 

LTF* 

SPP1 

A 

CDKN1B* 

V 

LYZ* 

STC2* 

CDX2 

MGST1 

V 

SUMF2* 

CFB* 

A 

MIF 

THBS2 

COL11A1* 

A 

MMP1 

A 

TIMP1 

A 

COL1A1* 

A 

MMP10 

A 

TIMP2 

V 

COL1A2* 

A 

MMP11 

A 

TIMP3 

V 

COL3A1* 

MMP12 

A 

TIMP4 

V 

COL5A1* 

A 

MMP13 

A 

TK1 

A 

COL5A2* 

V 

MMP14 

TM9SF2* 

COL6A3* 

MMP16 

TNFRSF10B 

V 

COLSA1* 

MMP2 

V 

TNN 

V 

COMP* 

A 

MMP20 

TOP2A 

A 

CSNK2A1 

MMP3 

TP53 

CTGF 

V 

MMP7 

V 

TRPS1 

CTHRC1* 

A 

MMPS 

TTF1 

CYP4B1* 

V 

MMP9 

VCAN* 

A 

CYR61 

V 

MSLN 

VEGFA 

DEFA1 

MUC1 

A 

VTCN1 

DEFA3 

NES 

V 

WFDC2 

A 

ECM1* 

OAS1* 

WT1 

A 

EGFR 

V 

OAS2* 

A 

XBP1 

EPO 

PEBP1 

V 

YWHAZ 

126 genes found by mining of expression data and/or LevelsDB (asterisk).
Thresholds for cancers and controls were determined by expression in the 28
normal mammaplasty tissues as mean + 3 SD for genes with over-expression in
the cancers and as below the minimum for genes with under-expression in the
cancers. Over- (A) or under- (V) expression in cancer tissue with >20% of the
cancers and <5% of the controls above or below threshold.
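
The threshold rule in the Table 2 footnote can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis code used for the report: the function name `classify_gene`, the use of the sample standard deviation, and the choice to treat the mammaplasty normals as the "controls" are all assumptions made here.

```python
from statistics import mean, stdev


def classify_gene(cancer, normal, min_cancer_frac=0.20, max_control_frac=0.05):
    """Return 'A' (over-expressed), 'V' (under-expressed) or '' for one gene.

    cancer / normal are lists of expression values for that gene.
    Assumption: the normal mammaplasty values serve both to set the
    thresholds and as the control group.
    """
    upper = mean(normal) + 3 * stdev(normal)  # over-expression threshold
    lower = min(normal)                       # under-expression threshold

    def fraction(values, predicate):
        return sum(1 for v in values if predicate(v)) / len(values)

    over_cancer = fraction(cancer, lambda v: v > upper)
    over_control = fraction(normal, lambda v: v > upper)
    under_cancer = fraction(cancer, lambda v: v < lower)
    under_control = fraction(normal, lambda v: v < lower)

    if over_cancer > min_cancer_frac and over_control < max_control_frac:
        return "A"
    if under_cancer > min_cancer_frac and under_control < max_control_frac:
        return "V"
    return ""


# Hypothetical values for a single gene
normal = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95]
print(classify_gene([5.0, 5.0, 1.0, 1.0], normal))
```

With these made-up values, half of the cancers exceed mean + 3 SD of the normals and none of the normals do, so the gene would be marked A.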
Table 3. Genes resulting from the OpenArray analysis

CSF1     | -100% | HADHA  | -77% | TP53I3   | -62% | SLC2A3 | -38%
EGR1     | -100% | MCC    | -77% | ANPEP    | -54% | SMPD1  | -38%
FLT1     | -100% | RELA   | -77% | BAG1     | -54% | TGFBI  | -38%
FOS      | -100% | BTG2   | -69% | ILK      | -54% | BNIP3  | -31%
NID1     | -100% | CNBP   | -69% | ING1     | -54% | CBLB   | -31%
SEPP1    | -100% | DHX8   | -69% | PECAM1   | -54% | DEGS1  | -31%
SRPX     | -100% | EPHA2  | -69% | PIR      | -54% | EGLN1  | -31%
TGFBR2   | -100% | GNB2L1 | -69% | RIPK1    | -54% | ETV6   | -31%
TGFBR3   | -100% | IGFBP4 | -69% | SFRS7    | -54% | FOSL2  | -31%
TIE1     | -100% | NDRG1  | -69% | TSG101   | -54% | LDHA   | -31%
VIM      | -100% | PAQR3  | -69% | CAPNS1   | -46% | NR1D1  | -31%
HYAL1    | -92%  | PEA15  | -69% | CHPT1    | -46% | PRKCD  | -31%
PPARG    | -92%  | PFDN5  | -69% | EIF5     | -46% | PRNP   | -31%
RAB5A    | -92%  | RAF1   | -69% | GTF2I    | -46% | SORT1  | -31%
SEMA3C   | -92%  | RAP1A  | -69% | JAK1     | -46% | TRADD  | -31%
SPRY2    | -92%  | SKI    | -69% | MDM2     | -46% | EVL    | 31%
CCND3    | -85%  | SP1    | -69% | MLLT10   | -46% | HSPB1  | 31%
CDC42BPA | -85%  | STK3   | -69% | SELENBP1 | -46% | KIF3B  | 38%
CIRBP    | -85%  | CSF1R  | -62% | ATP5B    | -38% | PKM2   | 38%
FOXO1    | -85%  | GAS6   | -62% | AXL      | -38% | RFC4   | 38%
ITGB3    | -85%  | NF2    | -62% | CTNNA1   | -38% | RARA   | 46%
PTEN     | -85%  | PECI   | -62% | DCN      | -38% | RAD21  | 54%
RHOB     | -85%  | PRKCE  | -62% | EXT1     | -38% | PRC1   | 77%
TYRO3    | -85%  | STAT3  | -62% | HRB      | -38% | SKIL   | 77%
ABL1     | -77%  | TAF1   | -62% | PPP2R5A  | -38% |        |
CD59     | -77%  | TJP1   | -62% | PRKD2    | -38% |        |


List of the 102 genes from the OpenArray analysis and the percentage of tumor
tissues in which they were differentially expressed. Negative numbers indicate
under-expression.

FIGURE  LEGEND 

Figure 1. Unsupervised hierarchical clustering of 93 tissues (24 invasive
cancers, 38 ipsilateral normal, 3 contralateral normal, 28 normal tissues from
reduction mammaplasty) from 64 patients and 67 genes that discriminate between
invasive tissues and mammaplasty normal tissues (red and green dots: over- and
under-expression by PGR). Columns: tissues form two distinct clusters
(indicated below the figure). Rows: genes form a cancer and a normal cluster,
the latter being divided into one with underexpression in all cancer tissues
(left, green line) and one with mixed expression (orange-blue line).
Luminal-like and basal-like clusters are indicated above the figure. The part
of the heat map driving the luminal-like cluster is boxed (blue: luminal-like
genes, turquoise: lobular tissues). Tissues from deceased or recurred patients
have a black or orange dot above the tissue descriptor, which has the
following abbreviated components: PatientNo - Diagnosis (IDC=invasive ductal
carcinoma, ILC=invasive lobular carcinoma, MET=metaplastic carcinoma,
MUC=mucinous carcinoma, NML=normal) - Stage - Grade TissueNo - Description
(CA=cancer, NM=normal mammaplasty, NI=normal ipsilateral, NC=normal
contralateral) BI-RADS Density Subtype (LUM=luminal, BAS=basal, HER2). The
tissue descriptors are shaded as follows: orange=lobular cancers, pink=other
cancers, green=ipsilateral normals, blue=contralateral normals,
purple=mammaplasty normals. Heat map: red=up-regulation,
green=down-regulation, grey=missing or zero value. The lines below the heat
map connect tissues from the same patient.
Abstract

Despite the prevalence of anemia in cancer, recombinant erythropoietin (Epo)
has declined in use because of recent Phase III trials showing more rapid
cancer progression and reduced survival in subjects randomized to Epo. Since
Epo receptor (EpoR), Jak2, and Hsp70 are well-characterized mediators of Epo
signaling in erythroid cells, we hypothesized that Epo might be especially
harmful in patients whose tumors express high levels of these effectors.
Because of the insensitivity of immunohistochemistry for detecting low-level
EpoR protein, we developed assays to measure levels of EpoR, Jak2, and Hsp70
mRNA in formalin-fixed paraffin-embedded (FFPE) tumors. We tested 23 archival
breast tumors as well as 136 archival head and neck cancers from ENHANCE, a
Phase III trial of 351 patients randomized to Epo versus placebo concomitant
with radiotherapy following complete resection, partial resection, or no
resection of tumor. EpoR, Jak2, and Hsp70 mRNA levels varied >30-fold,
>12-fold, and >13-fold across the breast cancers, and >30-fold, >40-fold, and
>30-fold across the head and neck cancers, respectively. Locoregional
progression-free survival (LPFS) did not differ among patients whose head and
neck cancers expressed above- versus below-median levels of EpoR, Jak2, or
Hsp70, except in the subgroup of patients with unresected tumors (n = 28),
where above-median EpoR, above-median Jak2, and below-median Hsp70 mRNA levels
were all associated with significantly poorer LPFS. Our results provide a
framework for exploring the relationship between Epo, cancer progression, and
survival using archival tumors from other Phase III clinical trials.
Stem Cells 2009;27:2353-2361
Introduction


Anemia is common in cancer patients and likely represents an independent poor
prognostic factor for survival [1]. Safety concerns associated with transfused
blood elevated erythropoietin (Epo) to a mainstay treatment in oncology.
However, recent Phase III clinical trials testing new uses for Epo, including
targeting higher hemoglobin levels and treating anemia not caused by
chemotherapy, showed that Epo reduced cancer survival times. Venous
thromboembolism is a well-documented risk of Epo [2]; however, the adverse
outcomes in these trials were attributed mainly to accelerated tumor
progression [3-7].

Whether Epo can indeed stimulate cancer progression is the subject of an
intense controversy [8], and preclinical models have generated conflicting
results (reviewed by Arcasoy [9]). Central to the controversy is whether tumor
progression reflects an "off-target" interaction between Epo and
Epo-responsive tumor cells and/or tumor blood vessels. At issue is whether
tumors (or tumor blood vessels) can expropriate signaling pathways known to
confer Epo responsiveness in erythroid cells. Epo receptor (EpoR) mRNA and
protein are detectable in tumor cells, albeit at levels much lower than in
erythroid cells [10, 11]. Notably, a recent study showed that a neuroblastoma
cell line expressing fewer than 50 Epo binding sites per cell can still be
protected from apoptosis in response to Epo [12]. Thus, the pertinent
unanswered question is whether even low-level expression of EpoR or other
effectors of Epo signaling can promote cancer progression in patients treated
with Epo.

A direct approach to examining this issue would be to characterize archival
tumor specimens from patients who had enrolled in Phase III clinical trials of
Epo versus placebo, testing whether randomization to Epo was especially
harmful in those patients whose tumors expressed higher levels of EpoR and/or
downstream effectors of Epo signaling. A previous study employing this
approach characterized 154 archival tumors from ENHANCE, a Phase III trial of
351 patients randomized to Epo versus placebo concomitant with radiotherapy
following complete resection, partial resection, or no resection of head and
neck cancer [13]. Tumors were evaluated using a commercially available
polyclonal antibody raised against a human EpoR peptide (C20) that also
cross-reacts with non-EpoR proteins, including heat shock protein 70 (Hsp70)
family members [14]. A significant association between Epo assignment and
reduced LPFS was observed among patients with C20-positive tumors (p = .003,
n = 104) that was not observed in patients with C20-negative tumors. However,
the aforementioned cross-reactivity between C20 and non-EpoR proteins obscured
the interpretation of this finding.

Because of the inadequacy of reagents for detecting low-level EpoR protein in
archival tumors, we measured mRNA. Most clinical tumor specimens are
formalin-fixed and paraffin-embedded (FFPE), causing RNA degradation. We
therefore developed methods to measure mRNA levels of EpoR and 16 other genes
from FFPE tumors. To test whether the adverse effects of Epo might be mediated
by increased expression of other genes implicated in Epo responsiveness, we
included Csf2rb, Jak2, and Hsp70. Csf2rb encodes the common beta receptor
(βcR), a shared signaling subunit for several cytokine receptors that has been
suggested to enhance Epo signaling in nonerythroid cells [15]. Jak2 is a
tyrosine kinase that is an essential mediator of Epo signaling in erythroid
cells [16], facilitates cell surface EpoR expression [17], and is also
implicated in Epo-mediated neuroprotection [18]. Hsp70 family members are
encoded by eight Hspa genes, perform essential roles in protein folding,
transport, and degradation [19], and promote cancer cell survival [20]. Hspa1a
and Hspa1b encode proteins with one amino acid difference, collectively
referred to as the major stress-inducible Hsp70. In differentiating erythroid
cells, Hsp70 accumulates in the nucleus in response to Epo, where it shields
the transcription factor Gata-1 from caspase-3-mediated degradation [21].
Additional markers were included to test whether the adverse effects of Epo
might depend on vascular endothelial cell representation (Cdh5, Pecam1,
Vegfa), tumor squamous epithelial cell representation (Krt5) [22], or cancer
stem cells (Cd44) [23], since a recent study suggested that Epo may increase
the self-renewal capacity of CD44+ breast cancer-initiating cells [24]. We
also measured transcripts for Epo itself, and seven control genes for
normalization (see below). Our results provide a framework for investigating
Epo-induced tumor progression.


Methods 


Cell  Lines 

All cancer cell lines have been previously described. To prepare Ba/F3-hEpoR
cells, Ba/F3 cells [25] were electroporated with pcDNA3.1-hEpoR encoding a
human EpoR cDNA (a gift from Joseph Prchal, University of Utah), selected in
1 mg/ml Geneticin (Invitrogen, Carlsbad, CA, http://www.invitrogen.com), and
maintained in 1 U/ml epoetin alfa (Procrit, Ortho Biotech, Bridgewater, NJ,
http://www.orthobiotech.com). COS-hEpoR cells were prepared by transfecting
COS cells with pcDNA3.1-hEpoR using Lipofectamine 2000 (Invitrogen) and were
collected 48 hours after transfection. AT-2 cells were provided by Janet
Rowley (University of Chicago) and ASE2 cells were provided by Chugai
Pharmaceuticals (Japan).

Immunohistochemistry 

FFPE cell pellets were sectioned (6 micron) and slides were deparaffinized and
rehydrated through a graded ethanol series. Endogenous peroxidase activity was
blocked using 0.3% hydrogen peroxide for 8 minutes, and endogenous biotin
sites were blocked using the Avidin/Biotin Blocking Kit (Dako, Glostrup,
Denmark, http://www.dako.com). Sections were then incubated with a polyclonal
goat anti-EpoR antibody (ab10653, Abcam, Cambridge, MA, http://www.abcam.com)
for 60 minutes. Primary antibodies were detected using a biotinylated antigoat
secondary antibody (Jackson ImmunoResearch, West Grove, PA,
http://www.jacksonimmuno.com) for 30 minutes followed by visualization using
the Vector Elite ABC system (Vector Laboratories, Burlingame, CA,
http://www.vectorlabs.com). Staining was visualized with
3,3'-diaminobenzidine for 7 minutes, and the sections were counterstained with
hematoxylin for 2 minutes. Concentration-matched isotype controls (Jackson
ImmunoResearch) were run for each cell sample.

Flow  Cytometric  Detection  of  Cell  Surface  EpoR 

Adherent cell lines were lifted for 15 minutes using 0.02%
ethylenediaminetetraacetic acid in phosphate buffered saline (PBS), washed
with PBS, and filtered through a 70 µm strainer. Cells were blocked for 15
minutes at room temperature in fluorescence-activated cell sorting (FACS)
buffer (PBS, 0.1% bovine serum albumin (BSA), 0.02% sodium azide) containing
250 µg/ml human immunoglobulin G (IgG) (Sigma-Aldrich, St. Louis, MO,
http://www.sigmaaldrich.com). A murine monoclonal antihuman
EpoR-phycoerythrin (PE) antibody (FAB307P, R&D Systems, Minneapolis, MN,
http://www.rndsystems.com) was then added to 5 µg/ml and cells were incubated
for 30 minutes on ice. Cells were also stained with two different murine
IgG2b-PE isotype control antibodies (IC0041P, R&D Systems or 555058, BD
Biosciences, Franklin Lakes, NJ, http://www.bdbiosciences.com). After
staining, cells were washed, resuspended in FACS buffer, and analyzed by flow
cytometry (FACS-Canto, BD Biosciences). Dead cells were excluded from analyses
of adherent cells by inclusion of 3.75 µg/ml 7-aminoactinomycin D. For
determination of EpoR staining relative to each isotype control, the mean
fluorescence value obtained for three replicate isotype control staining
reactions was subtracted from the mean fluorescence value obtained for three
replicate anti-EpoR staining reactions.
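
The isotype subtraction described above amounts to a difference of replicate means, computed once per isotype control. A minimal sketch follows; the function name `specific_signal` and all fluorescence values are hypothetical and only illustrate the arithmetic.

```python
from statistics import mean


def specific_signal(anti_epor_mfi, isotype_mfi):
    """Mean anti-EpoR fluorescence minus mean isotype-control fluorescence.

    Both arguments are lists of replicate fluorescence values
    (three replicates each in the Methods).
    """
    return mean(anti_epor_mfi) - mean(isotype_mfi)


# Hypothetical replicate values for one cell line and two isotype controls
anti_epor = [110.0, 120.0, 130.0]
isotype_1 = [10.0, 20.0, 30.0]
isotype_2 = [15.0, 25.0, 35.0]

print(specific_signal(anti_epor, isotype_1))  # difference versus isotype 1
print(specific_signal(anti_epor, isotype_2))  # difference versus isotype 2
```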

Signal  Transducer  and  Activator  of  Transcription 
5  Phosphorylation 

REH (acute lymphoblastic leukemia) and U266 (myeloma) cells were washed twice,
starved for 5 hours in Roswell Park Memorial Institute media (RPMI), 0.5% BSA,
and stimulated for 15 minutes with either 10 U/ml epoetin alfa (Procrit, Ortho
Biotech) or vehicle (Procrit buffer: 2.5 mg/ml human albumin, 1.3 mg/ml sodium
citrate, 8.2 mg/ml sodium chloride, 0.11 mg/ml citric acid, 1% benzyl
alcohol). For Epo antagonist control reactions, a 225 amino acid recombinant
soluble human EpoR extracellular domain (R&D Systems) was added to 2.25 µg/ml.
Cells were then fixed in 2% paraformaldehyde (10 minutes, 37°C), washed,
permeabilized with 90% methanol in PBS (30 minutes, 4°C), washed, resuspended
in FACS buffer, and stained for 20 minutes at room temperature with an Alexa
Fluor 647 anti-phospho-signal transducer and activator of transcription 5
(STAT5) phosphotyrosine 464 (PY464) antibody (1:5 dilution, 612599, BD
Biosciences). Cells were then washed, resuspended in FACS buffer, and analyzed
by flow cytometry (FACS-Canto, BD Biosciences).

Tumor  Samples 

Permission was obtained from the University of Washington Institutional Review
Board to study primary tumors. Breast tumors were from an established
repository (Department of Defense grant DAMD17-02-1-0691) that stores tissues
donated by women undergoing surgery for breast cancer (invasive cancer or in
situ disease) as FFPE tissue and as snap-frozen tissue. FFPE head and neck
tumors were obtained from the local Pathology Department and from ENHANCE, a
previously reported [3] multicenter Phase III trial of epoetin beta in 351
patients receiving radiotherapy for head and neck cancer. All ENHANCE samples
were among 154 tumors previously examined using the polyclonal C20 anti-EpoR
antibody (Santa Cruz Biotechnology, Santa Cruz, CA, http://www.scbt.com) [13].
One hundred thirty-six of the 154 tumors were sent as four micron FFPE
sections on deidentified, coded glass slides to the University of Washington
for mRNA analysis by an investigator blinded to clinical outcomes.

ENHANCE  Trial  Design 

Patient selection, treatment, follow-up, evaluation, and baseline
characteristics were described previously [3, 13]. Briefly, the main inclusion
criteria were squamous cell carcinomas of the head and neck, scheduled
definitive or postoperative radiotherapy, and a decreased blood hemoglobin
(<13 g/dl, men; <12 g/dl, women) at randomization. Patients were randomly
assigned to 300 IU/kg epoetin beta or placebo three times per week starting 10
to 14 days before radiotherapy and continuing throughout. Prior to
randomization, patients were stratified by resection status: 1) complete
resection; 2) incomplete resection; or 3) unresected disease. Iron (III)
saccharate (200 mg) was administered intravenously once weekly to patients
with <25% transferrin saturation. Epoetin beta was stopped if hemoglobin
increased more than 2 g/dl within 1 week or when targets were reached
(>15 g/dl, men; >14 g/dl, women) and resumed when hemoglobin fell below
target. Locoregional cancer control and survival were assessed at 3-month
intervals by an independent oncologist blinded to treatment assignment. The
primary endpoint was locoregional progression-free survival (LPFS).
Locoregional progression was noted if the tumor recurred or increased by 25%.
Baseline serum Epo levels were determined prior to treatment.

Quantitative  Reverse  Transcriptase  Polymerase 
Chain  Reaction 

RNA was extracted from cancer cell lines using the RNeasy Mini Kit (Qiagen,
Valencia, CA, http://www.qiagen.com) and from FFPE tumor sections using the
Absolutely RNA FFPE kit (Stratagene, La Jolla, CA, http://www.stratagene.com).
On-column DNase I digestion was performed to remove genomic DNA. First strand
cDNA was synthesized with random hexamer primers and Superscript III reverse
transcriptase (RT) (Invitrogen), the latter omitted for no-RT control
reactions. Next, cDNA targets were amplified using Taqman probes and a 7900HT
thermal cycler (Applied Biosystems, ABI, Foster City, CA,
http://www.appliedbiosystems.com). With the exception of certain intronless
members of the Hsp70 family and the candidate reference gene 18s, all probes
recognized exon junctions to prevent genomic DNA amplification (supporting
information Table 1). Cycle threshold (Ct) values were determined with the
Sequence Detection Software (ABI). A coefficient of variation <4% for
triplicate Ct determinations was considered acceptable. Where indicated in the
text, preamplification of cDNA was performed with the Taqman preamplification
multiplex system (ABI). Preamplification uniformity (lack of bias) for each
Taqman probe was tested by calculating ΔCt values for data obtained with both
unamplified and preamplified cDNA, where ΔCt = mean Ct for target gene - mean
Ct for reference gene. This was performed using several ENHANCE samples that
contained sufficient RNA, erythroid ASE2 cells [26], and a universal human
total RNA standard (Stratagene). Uniformity comparisons (ΔCt preamplified -
ΔCt unamplified) were considered acceptable to a tolerance of variation of 1.5
cycles per manufacturer's instructions (ABI). Relative quantification was
determined using the comparative Ct method, 2^(-ΔΔCt), where ΔCt = mean Ct for
target gene - mean Ct for reference gene. Reference gene stability was
evaluated using the Genorm algorithm [27].
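
The Ct arithmetic in this section can be illustrated with a short sketch. It assumes the standard comparative Ct form, in which a sample's ΔCt is compared with the ΔCt of a calibrator sample; the function names and all Ct values below are hypothetical, and the calibrator choice is an assumption rather than something stated in the Methods.

```python
from statistics import mean, stdev


def cv_percent(ct_values):
    """Coefficient of variation (%) of replicate Ct values."""
    return 100 * stdev(ct_values) / mean(ct_values)


def delta_ct(target_cts, reference_cts):
    """dCt = mean Ct of target gene - mean Ct of reference gene."""
    return mean(target_cts) - mean(reference_cts)


def preamp_is_uniform(dct_preamplified, dct_unamplified, tolerance=1.5):
    """Preamplification is accepted if the dCt shift is within the tolerance."""
    return abs(dct_preamplified - dct_unamplified) <= tolerance


def relative_quantity(dct_sample, dct_calibrator):
    """Comparative Ct: 2^-(dCt_sample - dCt_calibrator)."""
    return 2 ** -(dct_sample - dct_calibrator)


# Hypothetical triplicate Ct values
target = [30.1, 30.0, 29.9]
reference = [25.0, 25.1, 24.9]

assert cv_percent(target) < 4            # acceptance criterion for triplicates
dct = delta_ct(target, reference)        # about 5 cycles
print(preamp_is_uniform(dct, 5.8))       # shift of about 0.8 cycles
print(relative_quantity(dct, 7.0))       # about 4-fold relative to the calibrator
```

A lower ΔCt means more target transcript relative to the reference gene, so a sample whose ΔCt is two cycles below the calibrator's is about four times more abundant.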

Statistical  Analysis 

The number of patients included in the analysis of LPFS for each marker
depended upon the number of samples yielding sufficient RNA for quantitative
reverse transcriptase polymerase chain reaction (RT-PCR). This varied for each
marker as a function of the expression level of that gene. Our analyses
included all available data for each marker. For LPFS analyses, patients were
stratified into above-median or below/equal-median mRNA expression levels.
This stratification was done separately for the total population and within
each resection stratum for every gene. LPFS was evaluated with the
Kaplan-Meier survival estimation. The log-rank test was used to test the null
hypothesis that the distributions of survival times in patients treated with
Epo versus placebo were equal. The STATA statistical software package was used
for all analyses (version 10.0, Stata Corporation, College Station, TX,
http://www.stata.com). Statistical tests were two-sided and considered
significant at p < 0.05. For patients in the placebo group stratified by
endogenous serum Epo, <11 U/l was defined as low, whereas >11 U/l was defined
as high, based on a previous study [28]. Spearman's correlation coefficients
were calculated for C20 staining status (measured as a dichotomous variable)
versus EpoR or Hsp70 mRNA levels (measured as a continuous variable). In cell
line studies, Spearman's correlation coefficients were calculated for EpoR
mRNA versus surface protein levels.
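
As an illustration of the median stratification and Kaplan-Meier estimation described above, the following sketch uses only the Python standard library. It is not the Stata code used for the study; the function names and the toy data are hypothetical, and the log-rank test itself is not implemented here.

```python
from statistics import median


def split_by_median(levels):
    """Map patient id -> 'above' or 'below/equal' relative to the median level."""
    m = median(levels.values())
    return {pid: ("above" if v > m else "below/equal") for pid, v in levels.items()}


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each observed event time.

    events[i] is True for locoregional progression and False for censoring.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_at_t = sum(1 for tt, _ in data if tt == t)        # leaving the risk set
        d_at_t = sum(1 for tt, e in data if tt == t and e)  # events at time t
        if d_at_t:
            survival *= 1 - d_at_t / at_risk
            curve.append((t, survival))
        at_risk -= n_at_t
        i += n_at_t
    return curve


# Hypothetical mRNA levels and follow-up data (months)
groups = split_by_median({"p1": 0.5, "p2": 2.0, "p3": 8.0})
curve = kaplan_meier([1, 2, 3, 4], [True, False, True, True])
print(groups)
print(curve)
```

In the toy follow-up data the patient censored at month 2 leaves the risk set without lowering the survival estimate, which is the property that distinguishes the Kaplan-Meier estimate from a simple proportion.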


Results 


EpoR  mRNA  and  Surface  Protein  Levels  in  Cancer 
Cell  Lines 

To characterize the relation between EpoR mRNA and cell surface protein, we
tested 32 human cell lines including three high EpoR-expressing positive
control cell lines: UT7EPO, ASE2, and OCIM1. For normalization of mRNA levels,
we used the three most stable reference genes (Hmbs, Hprt1, Rplp0) among a
panel of seven candidates evaluated across all cell lines as determined by the
Genorm algorithm (supporting information Table 2). EpoR mRNA levels among the
noncontrol lines ranged from 0.5 to 7.5% (mean 2.0%) of the level in UT7EPO
cells (Fig. 1A). For flow cytometry, we compared the fluorescent intensity of
cells stained with a monoclonal antibody directed against EpoR with the same
cells stained using two different isotype control antibodies. Surface EpoR
levels among the noncontrol lines ranged from 1.2 to 25.2% (mean 8.3%) of the
level in UT7EPO cells (Fig. 1B). Among all cell lines tested there was a
significant correlation between mRNA and surface protein (r = .33, p = .03,
n = 32). When positive-control UT7EPO, ASE2, and OCIM1 cells were excluded,
the significance of this correlation was maintained among nonadherent cells
(r = .58, p = .03, n = 11, Fig. 1C) but no correlation was observed when our
analysis was restricted to the adherent cell lines (r = -.19, p = .23, n = 18,
Fig. 1D). Of note, our analysis of adherent cell lines (n = 18) required
additional processing steps to generate single cell suspensions (see Methods)
that were associated with significant cell death and debris (not shown).
Moreover, 16 of 18 adherent cell lines produced discordant staining patterns
for the two different isotype control antibodies (that is, average deviation
in fluorescence values obtained for anti-EpoR antibody staining relative to
fluorescence values obtained for two isotype control antibodies exceeded 5% of
the mean), whereas only 5 of 11 nonadherent cell lines showed this discrepancy
(p < .01) (Fig. 1B). The apparent lack of correlation between EpoR mRNA and
surface protein among the adherent cell lines likely results from these
technical limitations that preclude accurately estimating levels of EpoR on
the cell surface, but may also be influenced by post-transcriptional
regulation of EpoR in these lines. Analysis of two of the EpoR-expressing
nonerythroid lines (U266 and REH) demonstrated Epo-dependent STAT5
phosphorylation (supporting information Fig. 1), consistent with previous
reports of Epo-dependent signal transduction in nonerythroid cells [9, 29].

Figure 1. EpoR mRNA and surface protein levels in cancer cell lines. (A): EpoR
mRNA levels were determined by quantitative reverse transcriptase polymerase
chain reaction. The mRNA level of each cancer cell line is plotted relative to
the level in control UT7EPO cells. Error bars represent the standard deviation
of results obtained upon normalization to Hmbs, Rplp0, and Hprt1. (B):
Differences between the mean fluorescent intensities of cells stained with a
phycoerythrin (PE)-conjugated monoclonal anti-EpoR antibody versus each of two
different PE-conjugated murine IgG2b isotype controls are depicted. Results
are plotted relative to UT7EPO cells. Error bars depict standard deviations of
the differences. (C): The rank order of mRNA and protein expression is plotted
for all nonadherent cell lines, excluding the positive control lines UT7EPO,
ASE2, and OCIM1. (D): The rank order of mRNA and protein expression is plotted
for all adherent cell lines. Spearman's rank order correlation coefficients
are indexed above the graphs in (C) and (D).
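
The rank-order comparison of mRNA and surface protein levels rests on Spearman's correlation, which is Pearson's correlation applied to ranks. A minimal pure-Python sketch, with hypothetical expression values and tied values given their average rank, might look like this:

```python
def ranks(values):
    """1-based ranks; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical relative mRNA and surface protein levels for five cell lines
mrna = [0.5, 1.2, 2.0, 3.1, 7.5]
protein = [1.2, 8.0, 3.5, 9.9, 25.2]
print(spearman(mrna, protein))
```

Because only ranks enter the calculation, a single very high expresser such as a positive control line cannot dominate the coefficient the way it could with a Pearson correlation on raw values.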

Development  of  a  Quantitative  RT-PCR  Assay  for 
EpoR  mRNA  in  Archival  Tumor  Samples 

Most tumors from clinical trials are preserved as FFPE tissue. We found that
immunohistochemistry with a specific antibody was not sufficiently sensitive
to detect low-level EpoR protein in FFPE tumor cell lines (supporting
information Fig. 2). We therefore tested whether EpoR mRNA could be accurately
measured in FFPE tumors despite the RNA degradation that accompanies FFPE
processing [30]. Three independent assessments of EpoR mRNA levels from serial
sections of 11 FFPE breast tumors demonstrated that our measurements were
highly reproducible, and that EpoR mRNA levels varied as much as 34-fold
(supporting information Fig. 3). To assess the validity of mRNA measurements
from FFPE primary tumors, we compared expression levels for EpoR using 23
breast tumors which were divided and processed both as FFPE and snap-frozen
tissue. Because the FFPE and snap-frozen samples represent different pieces of
the same tumor, and snap-freezing preserves a higher degree of RNA integrity,
this comparison allowed us to simultaneously assess whether RNA degradation
influences the accuracy of our measurements as well as the uniformity with
which EpoR is expressed across tumors. We also measured mRNA levels of Jak2
and Hsp70, which participate in Epo signaling in erythroid cells [16, 21],
Csf2rb, which has been suggested to enhance Epo signaling in nonerythroid
cells [15], endothelial-associated genes (Cdh5, Pecam1, Vegfa), the squamous
epithelial marker Krt5 (especially relevant for head and neck cancer) [22],
the putative cancer stem cell marker Cd44 [23], and Epo itself. Significant
correlations between FFPE and snap-frozen mRNA measurements were observed for
EpoR, Csf2rb, Jak2, Hsp70, Cd44, Krt5, and Esr1 (estrogen receptor-1, used as
a positive control) (Fig. 2). These findings suggest that single tumor
sections can be used to gauge the overall expression levels of these markers.
Expression levels varied over a wide range for each of these genes (Fig. 3A).
In contrast, Vegfa, Cdh5, and Pecam1 were not significantly correlated,
consistent with regional heterogeneity in tumor vascularity [31], whereas Epo
was detected in too few FFPE tumors to permit calculation of a correlation
coefficient.

Figure 2. Analysis of concordance in mRNA measurements between snap-frozen and
formalin-fixed paraffin-embedded (FFPE) breast tumors. Twenty-three breast
tumors are ranked for their level of mRNA expression of the indicated genes
(normalized to Hmbs expression). Results using RNA extracted from snap-frozen
(y-axis) versus FFPE (x-axis) pieces of the same breast tumor are shown.
Spearman's rank order correlation coefficients are indexed above each graph.

Assessing  EpoR  mRNA  Levels  in  Head  and  Neck 
Cancers  from  ENHANCE 

We assayed EpoR mRNA levels in 136 archival FFPE head and neck tumors from
ENHANCE, a subset of the 154 evaluated previously by immunohistochemistry
using the C20 antibody [13]. Since most samples consisted of only a single
microscope slide with minimal tissue, we employed a nonbiased target-specific
cDNA preamplification method for all genes (supporting information Table 3).
We included 7 candidate reference genes for normalization (Hprt1, Ppia, Ipo8,
Hmbs, Gapdh, Tfrc, and Rplp0). These reference genes were included based on
their stability among 16 candidates tested by the Genorm algorithm [27] in a
panel of 8 breast cancers and 8 head and neck cancers (supporting information
Table 2). Results for Hprt1 were excluded because of high or no Ct values in
many samples. For tumor samples with sufficient RNA (representing 123
different tumors), there were strong positive correlations in Ct values among
all reference genes (r > 0.88 for all pairwise comparisons, p < .001).

We tested normalization of EpoR values with each of the
reference genes and assessed the extent to which relative
quantification values might be influenced by RNA
abundance/integrity. Specifically, Cronin et al. reported a
phenomenon in which greater age of FFPE blocks (and the
accompanying lower RNA abundance/integrity) was associated with
higher relative quantification values even after normalization
[32]. This effect was attributed to differential degradation of
target versus endogenous control gene transcripts and was
reduced by minimizing the size and range of target and control
gene assay amplicon sizes. In our data set, higher reference
gene Ct values (less RNA abundance/integrity) were indeed
associated with higher relative EpoR quantification upon
normalization to Ppia levels, as evidenced by the strong
positive correlation between Ppia Ct values and normalized EpoR
relative quantification values (supporting information Fig.
4A). Similar results were obtained for normalization with
Gapdh, Rplp0, Ipo8, or Tfrc. Whereas a similar pattern of
normalized EpoR expression was observed within subgroups of
tumor samples with similar amounts of RNA abundance/integrity,
the overall systematic effect of RNA abundance/integrity on
normalization would have precluded comparisons among all
patients. Consistent with Cronin et al., this systematic effect
was alleviated upon normalization to Hmbs, which had the
shortest amplicon size among all reference gene assays tested
(64 bp) (supporting information Fig. 4B). Five samples with low
RNA abundance/integrity produced relative quantification values
greater than mean + 1 standard deviation and were omitted. For
30 tumors, EpoR mRNA levels could not be determined relative to
other samples because of undetermined endogenous control gene
(Hmbs) and/or EpoR Ct values. Among the remaining 101 tumors,
we observed a >30-fold range of EpoR mRNA, and expression
levels also varied widely for the other genes examined (Fig.
3B).

Figure 3. Range of mRNA levels in primary tumors. Relative quantification values are shown as fold differences, with the value from the lowest-expressing tumor assigned a value of 1. Numbers at the bottom of each graph indicate the number of tumors for which data were obtained.
(A): Results for breast tumors. (B): Results for head and neck tumors from ENHANCE.
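As a rough illustration of the normalization described in this section, the following Python sketch converts target and reference gene Ct values into fold differences relative to the lowest-expressing sample, the convention used in Figure 3. It assumes perfect doubling per PCR cycle (the common 2^-dCt convention); the text does not state which efficiency model was actually applied, and the function name and Ct values are hypothetical.

```python
from typing import Dict, Optional


def relative_quantification(
    target_ct: Dict[str, Optional[float]],
    reference_ct: Dict[str, Optional[float]],
) -> Dict[str, float]:
    """Return fold differences relative to the lowest-expressing sample.

    Assumption: amplification efficiency is 100% (2^-dCt); this is an
    illustrative convention, not a detail confirmed by the source.
    """
    raw = {}
    for sample, ct in target_ct.items():
        ref = reference_ct.get(sample)
        # Undetermined Ct values (None) cannot be normalized; skip them.
        if ct is None or ref is None:
            continue
        delta_ct = ct - ref  # target Ct minus reference gene Ct
        raw[sample] = 2.0 ** (-delta_ct)
    if not raw:
        return {}
    lowest = min(raw.values())
    # Scale so the lowest-expressing sample is assigned a value of 1.
    return {sample: value / lowest for sample, value in raw.items()}


# Hypothetical Ct values for illustration only (not data from the study).
epor_ct = {"T1": 30.0, "T2": 27.0, "T3": None, "T4": 32.0}
hmbs_ct = {"T1": 26.0, "T2": 26.0, "T3": 27.0, "T4": 27.0}
rq = relative_quantification(epor_ct, hmbs_ct)
```

Samples with an undetermined Ct for either gene drop out of the result, which mirrors how tumors without usable Hmbs or EpoR Ct values could not be placed on the relative scale.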

mRNA Levels and Locoregional
Progression-Free  Survival 

We evaluated LPFS within each resection stratum for patients
with tumors expressing above- versus below-median levels of
each marker (determined separately for each stratum).
Significant associations between transcript level, Epo
treatment, and adverse outcome were observed only in the no
resection stratum (n = 28) (Table 1). Significantly poorer LPFS
was observed for Epo-treated subjects with above-median but not
below-median levels of EpoR (Fig. 4A) or Jak2 (Fig. 4B)
(EpoR: above-median p = .02, n = 14, below-median p = .8,
n = 14; Jak2: above-median p = .04, n = 17, below-median
p = .34, n = 18). In addition, we found a significant
association between Epo treatment, poor outcome, and
below-median but not above-median levels of Hsp70 family
members in aggregate (Fig. 4C) (Hsp70 below-median p = .01,
n = 20, above-median p = .38, n = 19) and individually
(supporting information Table 4). The significance of these
associations was not further increased by dichotomizing mRNA at
higher thresholds (that is, highest 10% vs. the rest) (not
shown). Combinations of above-median EpoR, above-median Jak2,
and below-median Hsp70 did not increase the significance of the
association between treatment assignment (Epo versus placebo)
and LPFS compared to each marker individually (not shown).
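The analysis above combines two steps: splitting tumors at the median marker value, then comparing Epo and placebo arms with a two-sided log-rank test. A minimal, standard-library Python sketch of both steps follows. It is a generic textbook log-rank calculation, not the authors' software; the function names and the follow-up data are hypothetical, and the split uses "above" versus "below or equal to" the median, as stated in the Table 1 footnotes.

```python
import math
from statistics import median
from typing import List, Sequence, Tuple

# (months of follow-up, True if locoregional progression was observed)
Subject = Tuple[float, bool]


def split_at_median(values: Sequence[float]) -> List[bool]:
    """Flag each value as above the median (True) or below/equal (False)."""
    cut = median(values)
    return [v > cut for v in values]


def log_rank_test(
    group_a: Sequence[Subject], group_b: Sequence[Subject]
) -> Tuple[float, float]:
    """Two-sided log-rank test; returns (chi-square statistic, p value)."""
    pooled = list(group_a) + list(group_b)
    event_times = sorted({t for t, event in pooled if event})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        at_risk_a = sum(1 for time, _ in group_a if time >= t)
        at_risk_b = sum(1 for time, _ in group_b if time >= t)
        events_a = sum(1 for time, ev in group_a if ev and time == t)
        events_b = sum(1 for time, ev in group_b if ev and time == t)
        n = at_risk_a + at_risk_b
        d = events_a + events_b
        # Observed events in group A minus the number expected under the null.
        observed_minus_expected += events_a - d * at_risk_a / n
        if n > 1:
            variance += d * (at_risk_a / n) * (at_risk_b / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 0.0, 1.0
    chi_square = observed_minus_expected ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Hypothetical values for illustration only (not data from the study).
marker_rq = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
above_median = split_at_median(marker_rq)
epo_arm = [(5.0, True), (8.0, True), (12.0, False)]
placebo_arm = [(9.0, True), (20.0, False), (24.0, False)]
chi_square, p_value = log_rank_test(epo_arm, placebo_arm)
```

In practice the test would be run separately within the above-median and below-median groups of each resection stratum, which is why each marker contributes two p values per stratum in Table 1.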

We also compared Epo-treated patients with above-median
EpoR expression to Epo-treated patients with below-median
expression in the no resection stratum (see bracket below the
graphs in Fig. 4A). Epo-treated patients with above-median
levels of tumor EpoR mRNA showed a trend toward worse LPFS, but
the difference from patients whose tumors expressed
below-median EpoR mRNA was not significant (p = .13, n = 11).
Analogous comparisons for Jak2 and Hsp70 mRNA levels also
showed no significant differences among Epo-treated patients.

Relationship  to  Endogenous  Erythropoietin  Levels 

If exogenous Epo can stimulate tumor progression, endogenous
Epo might also stimulate tumor progression. The
ENHANCE study documented single-time-point pretreatment
serum Epo levels [3], and we obtained these results for 147 of
the 154 patients reported previously [13] from the trial
sponsor. Confining analyses to subjects enrolled in the placebo
group, we did not find an association between LPFS and baseline
hemoglobin level or LPFS and baseline serum Epo level
(not shown). Additionally, baseline hemoglobin levels and
serum Epo levels did not correlate (not shown). Finally, we
tested whether elevated endogenous Epo levels were associated
with LPFS in subjects with above-median versus below-median
levels of EpoR, Jak2 or Hsp70 mRNA. Because of the
small number of patients available, we combined the incomplete
and no resection strata into a new category called “residual
tumor.” However, there was no association between
endogenous Epo level and LPFS based on tumor EpoR, Jak2
or Hsp70 mRNA levels (supporting information Table 5).

Table 1. Analysis of exogenous erythropoietin administration and locoregional progression-free survival by mRNA marker status (a)

                          Below-median marker value (b)   Above-median marker value (b)
                          No. of patients   Log-rank      No. of patients   Log-rank
mRNA        Stratum       Epo     Placebo   p value (c)   Epo     Placebo   p value (c)
EpoR        All patients  24      27        0.22          23      27        0.31
            Complete      11      13        0.90          14      10        0.47
            Incomplete    5       8         0.17          6       6         0.82
            No resection  7       7         0.80          4 (d)   10 (d)    0.02 (d)
Jak2        All patients  29      29        0.65          26      31        0.15
            Complete      14      14        0.12          14      13        0.15
            Incomplete    7       6         0.33          4       8         0.71
            No resection  8       10        0.34          8 (d)   9 (d)     0.04 (d)
Hsp70 (e)   All patients  31      31        0.98          29      32        0.40
            Complete      15      15        0.26          15      14        0.74
            Incomplete    6       7         0.79          5       7         0.16
            No resection  9 (d)   11 (d)    0.01 (d)      10      9         0.38
Csf2rb      All patients  24      29        0.55          26      27        0.83
            Complete      9       17        0.57          17      8         0.34
            Incomplete    5       8         0.78          6       6         0.18
            No resection  9       6         0.35          4       11        0.26
Cd44        All patients  30      32        0.73          30      31        0.50
            Complete      16      14        0.70          14      15        0.96
            Incomplete    5       8         0.52          6       6         0.89
            No resection  9       11        0.12          10      9         0.10
Krt5        All patients  30      27        0.84          27      30        0.28
            Complete      17      10        0.86          10      16        0.61
            Incomplete    4       9         0.89          7       5         0.65
            No resection  10      8         0.26          9       9         0.08

(a) Stratification was above versus below/equal to the median.
(b) The median was calculated separately for all patients and within each resection stratum.
(c) The p value is two sided and is based on the log-rank test to compare differences in Kaplan-Meier distributions in response to Epo versus placebo.
(d) Groups with significant adverse effects of Epo.
(e) Hsp70 mRNA represents the cumulative expression of all 8 family members. Results for individual family members are presented in supporting information Table 4.

Correlations  with  C20  Staining 

The tumors we evaluated were among the 154 previously
characterized using the C20 antibody [13], which was raised
against a human EpoR sequence but cross-reacts with other
proteins, including Hsp70 family members [14]. Notably, we
found no significant correlation between C20 status and EpoR
mRNA (r = -0.11, p = .26, n = 100), or Hsp70 family
member mRNA in aggregate (r = 0.06, p = .54, n = 122) or
individually (supporting information Fig. 5).


Discussion 


Because Epo is one of the most prominent drugs in oncology,
the unexpected association between Epo and increased cancer
death rates has created concern and uncertainty. One of the
central questions is whether Epo-induced “off-target” signaling
in tumors or tumor blood vessels can hasten cancer progression.
Preclinical models may bring insight to this issue; however,
definitive answers can only come from studies in humans.


The substantial challenge presented by the very low level of
EpoR present in nonerythroid cells is counterbalanced in part
by an extensive body of literature surrounding Epo signaling
in erythroid cells. Using this knowledge, we developed methods
to characterize human tumors for their potential competency
to respond to Epo. Since existing reagents for detecting
EpoR protein in tumor sections are insufficiently sensitive and
specific [14], we measured mRNA. Laying the foundation for
this effort, we show that EpoR mRNA levels can be estimated
despite the extensive RNA degradation that accompanies
FFPE processing and, barring results of adherent cell lines
that are difficult to accurately assess by flow cytometry, we
show that EpoR mRNA levels appear to reasonably estimate
levels of EpoR cell surface protein.

We found a >30-fold range of EpoR mRNA across a series
of breast cancers and head and neck cancers. This finding
does not necessarily contradict a previous study documenting
the lack of significant differences in EpoR mRNA levels
between tumors and normal tissues [33]. A preferential
susceptibility to Epo-induced signaling in malignant versus
normal tissue is not a prerequisite for a direct effect of Epo
on tumors; for example, a heritable basis for variation in EpoR
expression levels has been proposed in swine [34]. Measurements
of total tumor EpoR mRNA levels cannot distinguish
the cellular origin of the EpoR transcript, and the extent to
which the various cell types within tumors might respond to
Epo and contribute to tumor progression remains undetermined.
Future efforts directed at laser capture microdissection
of various cell types from tumor samples will help resolve
this issue. Wide variations in Jak2, Hsp70, and Csf2rb mRNA
levels were also found across our series of breast cancers and
head and neck cancers.

Figure 4. Effects of exogenous Epo on LPFS with stratification by
mRNA status. Outcomes in response to Epo versus placebo are shown
in Kaplan-Meier plots (x-axis: months), with separate panels for
below-median and above-median relative quantification (RQ) values.
The log-rank p value is two sided. Comparisons of outcomes of
patients randomized to Epo are indexed below the brackets.
(A): Results for EpoR. (B): Results for Jak2. (C): Results for
Hsp70. Hsp70 mRNA measurements reflect the cumulative expression
of all eight family members.

We used these tools to hunt for an association between
Epo exposure, a tumor’s inferred competency to respond to
Epo based on mRNA levels of Epo-associated signaling
molecules, and patient outcomes. The ideal testing grounds for
this effort are the archival tumors of patients who were
randomized in clinical trials of Epo versus placebo and whose
outcomes are known. For this first study, we examined available
tumors from ENHANCE [3]. Above-median levels of mRNA
for EpoR, and its tethered signaling intermediate Jak2,
emerged as candidate predictors of reduced LPFS in unresected
patients treated with Epo compared to placebo. In contrast,
we found no significant association between Csf2rb,
Krt5, or Cd44 mRNA levels and outcome. Since Hsp70 mediates
Epo signaling in erythroid cells [21] and is detected by
the C20 antibody [14], we predicted correlations between
Hsp70 mRNA levels, Epo and LPFS analogous to those
observed with EpoR and Jak2. To the contrary, we found a
strong association between below-median levels of all Hsp70
family members, Epo treatment and poor outcome. Importantly,
we did not find a correlation between EpoR mRNA
levels and prior staining of the same tumors using the C20
antibody [13]. These findings are consistent with the
interpretation that the C20 staining does not correlate with
EpoR expression [14]. Confirmation of C20 staining as a
predictor of tumor susceptibility to Epo may lead to the
identification of other proteins involved in Epo responsiveness.

Whether these tentative associations are reflective of
underlying tumor biology is unknown, and the interpretation
of our findings must be tempered by several limitations. First,
contrasting the disparate outcomes of subjects randomized to
Epo versus placebo, mRNA levels were not associated with
significant differences in LPFS when restricting our analysis
to patients in the Epo-treated group (see the brackets beneath
the Kaplan-Meier plots in Fig. 4). Thus, differences in LPFS
associated with above- versus below-median mRNA levels
cannot be accounted for entirely by differences in outcomes
in response to Epo. Similarly, we found no association
between these markers and adverse outcome in the presence
of elevated levels of endogenous Epo in patients enrolled in
the placebo arm of ENHANCE. Second, the statistically
significant associations that we observed were confined to
patients with unresected tumors, and did not extend to
patients with incomplete or complete resection of their
tumors. This discrepancy might be explained by the prediction
that, absent resection, a larger amount of tumor would be
available for Epo stimulation. In support of this interpretation,
the original ENHANCE trial also did not find an association
between Epo treatment and worse outcomes in patients with
completely resected tumors [3]. Third, the small number of
patients with unresected tumors also precludes further
stratification to adjust for potential baseline imbalances or
confounding clinical characteristics. Fourth, in view of the
exploratory nature of our hypothesis, we did not adjust for
multiple comparisons. This increases the likelihood that the
observed significant p values represent false positives [35].
Most importantly, our findings are constrained by lack of
access to additional tumors from other Phase III clinical
trials of Epo. Because these were large multicenter trials
lacking centralized tumor repositories, obstacles to obtaining
these tumors likely can only be surmounted by the trial
sponsors. Tapping this little-explored resource using the
methods described here may bring new insight to Epo and cancer
progression.
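To make the fourth limitation concrete, the short Python sketch below applies a Bonferroni correction to the three nominally significant p values in Table 1. Bonferroni is used here purely as an illustrative choice; the authors state that no adjustment was made. The count of 48 tests is an assumption that treats every log-rank p value in Table 1 (6 markers, 4 strata, 2 median groups) as one comparison and ignores the overlap between strata.

```python
from typing import Dict


def bonferroni_adjust(p_values: Dict[str, float], n_tests: int) -> Dict[str, float]:
    """Multiply each p value by the number of tests, capping at 1."""
    return {name: min(1.0, p * n_tests) for name, p in p_values.items()}


# Assumption: Table 1 reports 6 markers x 4 strata x 2 median groups = 48 tests.
N_TESTS = 6 * 4 * 2
ALPHA = 0.05

# The three nominally significant results in Table 1 (no resection stratum).
nominal = {
    "EpoR above median": 0.02,
    "Jak2 above median": 0.04,
    "Hsp70 below median": 0.01,
}

adjusted = bonferroni_adjust(nominal, N_TESTS)
# Number of p values below ALPHA expected if every null hypothesis were true.
expected_false_positives = N_TESTS * ALPHA
survivors = [name for name, p in adjusted.items() if p < ALPHA]
```

Under these assumptions none of the three results would remain below the 0.05 threshold after adjustment, which is consistent with treating them as hypothesis-generating rather than confirmatory.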